STOCHASTIC MODELS OF EVIDENCE ACCUMULATION IN
CHANGING ENVIRONMENTS 

ALAN VELIZ-CUBA*, ZACHARY P. KILPATRICK†, AND KREŠIMIR JOSIƇ


Abstract. Organisms and ecological groups accumulate evidence to make decisions. Classic 
experiments and theoretical studies have explored this process when the correct choice is fixed during 
each trial. However, we live in a constantly changing world. What effect does such impermanence 
have on classical results about decision making? To address this question we use sequential analysis 
to derive a tractable model of evidence accumulation when the correct option changes in time. Our 
analysis shows that ideal observers discount prior evidence at a rate determined by the volatility 
of the environment, and the dynamics of evidence accumulation is governed by the information 
gained over an average environmental epoch. A plausible neural implementation of an optimal 
observer in a changing environment shows that, in contrast to previous models, neural populations 
representing alternate choices are coupled through excitation. Our work builds a bridge between 
statistical decision making in volatile environments and stochastic nonlinear dynamics. 

Key words. mathematical neuroscience, decision making, dynamic environment, Bayesian
inference, recursive Bayesian estimation, sequential probability ratio test, drift-diffusion model 


1. Introduction. To navigate a constantly changing world, we intuitively use 
the most recent and pertinent information. For instance, when planning a route 
between home and work we use recent reports of accidents and weather. We discount 
older information, as our environment is in constant flux: The clouds threatening rain 
last night may have dissipated, and an accident reported an hour ago has likely been 
cleared. The optimal strategy is therefore to weight recent evidence more strongly. 

How to make decisions in the face of uncertainty and impermanence is a question
that recurs in fields ranging from economics to ecology and neuroscience. Mammals
[8,11,12,24], insects [13,38], single cells [3], and animal collectives [28] gather
evidence to make decisions. However, information about the state of the world is
typically incomplete and perception is noisy. Therefore, animals make choices based on
uncertain evidence. The case of an observer deciding between two alternatives based
on a series of noisy measurements has been studied extensively when the environment
is static [8,23,32,45]. In this case, humans [36] and other mammals [11,24] can
accumulate incoming evidence near optimally to reach a decision.

Stochastic accumulator models provide a plausible neural implementation of decision
making between two or more alternatives [4,43]. These models are analytically
tractable [8], and can implement optimal decision strategies [10]. Remarkably, there 
is also a parallel between these models and experimentally observed neural activity. 
Recordings in animals during a decision task suggest that neural activity reflects the 
weight of evidence for one of the choices [24]. 

A key assumption in many models is that the correct choice is fixed in time, 
i.e. decisions are made in a static environment. This assumption may hold in 
the laboratory, but natural environments are seldom static [17,34]. Recent experimental
evidence suggests that human observers integrate noisy measurements near
optimally even when the state of the environment changes. For instance, when observers
need to decide between two options and the corresponding reward changes in
a history-dependent manner, human behavior approximates that of a Bayes optimal
observer [5]. An important feature of evidence accumulation in volatile environments
is an increase in learning rate when recent observations do not support a current
estimate [31]. Both behavioral and fMRI data show that human subjects employ this
strategy when they must predict the position of a stochastically moving target [29].
Experimental work thus suggests that humans adjust evidence valuation to account 
for environmental variability. 

However, the dynamics of decision making in changing environments has not been 
fully investigated. To address this question we extend optimal stochastic accumulator 
models to a changing environment. These extensions are amenable to analysis, and 
reveal that an optimal observer discounts old information at a rate adapted to the 
frequency of environmental changes. As a result, the certainty that can be attained 
about any of the choices is limited. Our approach frames the decision making process 
in terms of a first passage problem for a doubly stochastic nonlinear model that can be 
examined using techniques of nonlinear dynamics. Extending previous work, we also 
identify accurate piecewise linear approximations to the nonlinear model. This model 
also suggests a biophysical neural implementation for evidence integrators consisting 
of neural populations whose activity represents the evidence in favor of a particular 
choice. When the environment is not static, optimal evidence discounting can be 
performed exactly by populations coupled through excitation. We also show that 
the computation can be well approximated by appropriately tuned classical linear 
population models [10,30,41,44]. 

2. Optimal decisions in a static environment. We develop our model in 
a way that parallels the case of a static environment with two possible states. We 
therefore start with the derivation of the recursive equation for the log-likelihood ratio 
of the two states, and the approximating stochastic differential equation (SDE), when 
the underlying state is fixed in time. 

To make a decision, an optimal observer integrates a stream of measurements 
to infer the present environmental state. In the static case, this can be done using 
sequential analysis [12,45]: An observer makes a stream of independent, noisy measurements,
$\xi_{1:n} = (\xi_1, \xi_2, \ldots, \xi_n)$, at equally spaced times, $t_{1:n} = (t_1, t_2, \ldots, t_n)$. The
probability of each measurement, $f_+(\xi_n) := \Pr(\xi_n|H_+)$ and $f_-(\xi_n) := \Pr(\xi_n|H_-)$,
depends on the environmental state. Combined with the prior probability, $\Pr(H_\pm)$,
of the states, this gives the ratio of probabilities,

$$R_n = \frac{\Pr(H_+|\xi_{1:n})}{\Pr(H_-|\xi_{1:n})} = \frac{f_+(\xi_1) f_+(\xi_2) \cdots f_+(\xi_n) \Pr(H_+)}{f_-(\xi_1) f_-(\xi_2) \cdots f_-(\xi_n) \Pr(H_-)},$$

which can also be written recursively [45]:

$$R_n = \frac{f_+(\xi_n)}{f_-(\xi_n)} R_{n-1}, \tag{2.1}$$

where $R_0 = \Pr(H_+)/\Pr(H_-)$ can describe an observer’s prior belief about the probability
of the two choices.

With a fixed number of observations, the ratio in Eq. (2.1) can be used to make
a choice that minimizes the total error rate [32], or maximizes reward [23]. Eq. (2.1)
gives a recursive relation for the log-likelihood ratio, $y_n = \ln R_n$,

$$y_n = y_{n-1} + \ln\frac{f_+(\xi_n)}{f_-(\xi_n)}. \tag{2.2}$$
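The recursion for the log-likelihood ratio in a static environment can be sketched in a few lines. This is an illustrative sketch only: the Gaussian observation densities, their parameters, and the function names `gaussian_pdf` and `accumulate_llr` are assumptions made for the example, not part of the model above.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def accumulate_llr(observations, f_plus, f_minus, prior_ratio=1.0):
    """Return [y_0, ..., y_n], where y_n = y_{n-1} + ln(f_+(xi_n) / f_-(xi_n))."""
    y = math.log(prior_ratio)  # y_0 = ln R_0, the log prior ratio
    trajectory = [y]
    for xi in observations:
        y += math.log(f_plus(xi) / f_minus(xi))  # one step of the recursion
        trajectory.append(y)
    return trajectory

# Hypothetical example: unit-variance Gaussians with means +1 and -1,
# for which each observation xi contributes exactly 2 * xi.
f_plus = lambda xi: gaussian_pdf(xi, +1.0, 1.0)
f_minus = lambda xi: gaussian_pdf(xi, -1.0, 1.0)
ys = accumulate_llr([0.5, -0.2, 1.3], f_plus, f_minus)
```

With a flat prior the observer would report the sign of the final entry of `ys`; because the environment is static, every observation carries the same weight regardless of when it arrived.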
Fig. 3.1. Evidence accumulation in a changing environment. (A) The environmental state
transitions from state $H_+$ to $H_-$ and back with rates $\epsilon_+$ and $\epsilon_-$, respectively. Observations follow
state dependent probabilities, $f_\pm(\xi) = \Pr(\xi|H_\pm)$. (B) The distributions of the measurements, $f_\pm(\xi)$,
change with the environmental state. Each individual observation changes the log-likelihood ratio,
$\ln(L_{n,+}/L_{n,-})$. A single realization is shown. (C,D) The evolution of the continuous approximation
of the log-likelihood ratio, $y(t)$, (panel C) and the log probabilities $x_\pm(t)$ (panel D). At time $t$,
evidence favors the environmental state $H_+$ if $y(t) > 0$ or, equivalently, if $x_+(t) > x_-(t)$.


When the time between observations, $\Delta t = t_j - t_{j-1}$, is small, we can use the Functional
Central Limit Theorem (p. 357 in [6]) to approximate this stochastic process
by the stochastic differential equation (SDE) [8,35],

$$dy = g_\pm\,dt + \rho_\pm\,dW_t, \tag{2.3}$$

where $W_t$ is a Wiener process, and the constants $g_\pm = \frac{1}{\Delta t}\mathrm{E}_\xi\left[\ln\frac{f_+(\xi)}{f_-(\xi)}\,\Big|\,H_\pm\right]$ and
$\rho_\pm^2 = \frac{1}{\Delta t}\mathrm{Var}_\xi\left[\ln\frac{f_+(\xi)}{f_-(\xi)}\,\Big|\,H_\pm\right]$ depend on the environmental state. Below we approximate other
discrete time processes, such as that given by Eq. (2.2), with SDEs. Details of these
derivations are provided in the Appendix.

In state $H_+$ we have $g_+\Delta t = \int f_+(\xi)\ln\frac{f_+(\xi)}{f_-(\xi)}\,d\xi$. The drift between two observations
thus equals the Kullback-Leibler divergence between $f_+$ and $f_-$, i.e. the
strength of the observed evidence from a measurement in favor of $H_+$ [14]. An equivalent
interpretation holds for $g_-$. Hence $g_+$ and $g_-$ are the rates at which an optimal
observer accumulates information. We will use this observation to interpret the parameters
of the model in a changing environment.

3. Two alternatives in a changing environment. We use the same assumptions
to derive a recursive equation for the log-likelihood ratio between two alternatives
in a changing environment. The state of the environment, $H(t)$, is $H_+$ or $H_-$, but
can now change in time (See Fig. 3.1A,B). When the environment is in one of these
two possible states, the statistics of the observations are fixed. Observation statistics
are therefore piecewise stationary in time. An observer infers the present state from a
sequence of observations, $\xi_{1:n}$, made at equally spaced times, $t_{1:n}$, with $\Delta t = t_j - t_{j-1}$,
and characterized by probabilities $f_\pm(\xi_n) := \Pr(\xi_n|H_\pm)$. The state of the environment
changes according to a telegraph process (e.g., p. 77 in [21]), and the probability
of a change between two observations is $\epsilon_\pm\Delta t := \Pr(H(t_n) = H_\mp\,|\,H(t_{n-1}) = H_\pm)$.
We assume that $\epsilon_+$ and $\epsilon_-$ are known to the observer.

The probabilities, $L_{n,\pm} = \Pr(H(t_n) = H_\pm|\xi_{1:n})$, then satisfy (See Appendix A):

$$L_{n,\pm} \propto f_\pm(\xi_n)\left((1 - \Delta t\,\epsilon_\pm)L_{n-1,\pm} + \Delta t\,\epsilon_\mp L_{n-1,\mp}\right), \tag{3.1}$$
Fig. 3.2. In a dynamic environment, the dynamics of the log-likelihood ratio, $y$, depends on
the rates of switching between states. (A) When $\epsilon_\pm = 0$, the environment is static, and the model
reduces to the one derived in Section 2. (B) When the environment changes slowly, $|\epsilon_\pm| \ll 1$,
the log-likelihood ratio, $y$, can saturate. (C) In a rapidly changing environment, $y$ tends not to
equilibrate. (D) When $\epsilon_+ = 0$ and $\epsilon_- > 0$, the task becomes a change detection problem.


with proportionality constant $\Pr(\xi_{1:n-1})/\Pr(\xi_{1:n})$. As in the static case, the ratio
of the probabilities of the two environmental states at time $t_n$ can be determined
recursively (See Appendix A), and equals

$$R_n = \frac{L_{n,+}}{L_{n,-}} = \frac{f_+(\xi_n)}{f_-(\xi_n)}\,\frac{(1 - \Delta t\,\epsilon_+)R_{n-1} + \Delta t\,\epsilon_-}{\Delta t\,\epsilon_+ R_{n-1} + 1 - \Delta t\,\epsilon_-}. \tag{3.2}$$

In this expression, the ratio of probabilities at the time of the previous observation,
$R_{n-1}$, is discounted in a way that depends on the frequency of environmental changes,
$\epsilon_\pm$. This equation, and the continuum limits we discuss below, have been derived
previously [16,50], but their dynamics were not analyzed.
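The discounting in Eq. (3.2) is easiest to see by iterating the update directly. The following is a minimal sketch; the function names `update_ratio` and `posterior_ratio` and all numerical values are hypothetical choices for illustration.

```python
def update_ratio(r_prev, likelihood_ratio, eps_plus, eps_minus, dt):
    """One step of Eq. (3.2): discount the previous ratio, then weigh in new evidence."""
    numerator = (1.0 - dt * eps_plus) * r_prev + dt * eps_minus
    denominator = dt * eps_plus * r_prev + 1.0 - dt * eps_minus
    return likelihood_ratio * numerator / denominator

def posterior_ratio(likelihood_ratios, eps_plus, eps_minus, dt, r0=1.0):
    """Iterate Eq. (3.2) over a sequence of values f_+(xi_n) / f_-(xi_n)."""
    r = r0
    for lr in likelihood_ratios:
        r = update_ratio(r, lr, eps_plus, eps_minus, dt)
    return r

# Static environment (both rates zero): evidence simply multiplies, as in Eq. (2.1).
static = posterior_ratio([2.0, 3.0, 0.5], 0.0, 0.0, 0.1)

# Volatile environment: even with constant evidence for H_+, the ratio saturates
# at the fixed point of the update map instead of growing without bound.
saturated = posterior_ratio([2.0] * 1000, 1.0, 1.0, 0.1)
```

In the second call the ratio settles at the positive root of $r^2 - 9r - 2 = 0$, roughly 9.2, which illustrates why certainty is limited when the environment can change.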

Eq. (3.2) describes a variety of cases of evidence accumulation studied previously
(See Fig. 3.2): If the environment is fixed ($\epsilon_\pm = 0$), we recover Eq. (2.1). If the
environment starts in state $H_-$, changes to $H_+$, but cannot change back ($\epsilon_- > 0$, $\epsilon_+ = 0$),
we obtain

$$R_n = \frac{f_+(\xi_n)}{f_-(\xi_n)}\,\frac{R_{n-1} + \Delta t\,\epsilon_-}{1 - \Delta t\,\epsilon_-},$$

a model used in change point detection [1,39,49].

We can again approximate the evolution of $y_n = \ln R_n$, i.e. the stochastic process
describing the evolution of the log of the likelihood ratio in Eq. (3.2), by an SDE:

$$dy = \Big[g(t) + \underbrace{\epsilon_-(e^{-y} + 1) - \epsilon_+(e^{y} + 1)}_{\text{nonlinearity}}\Big]dt + \rho(t)\,dW_t, \tag{3.3}$$

where $g(t) = \frac{1}{\Delta t}\mathrm{E}_\xi\left[\ln\frac{f_+(\xi)}{f_-(\xi)}\,\Big|\,H(t)\right]$, and $\rho^2(t) = \frac{1}{\Delta t}\mathrm{Var}_\xi\left[\ln\frac{f_+(\xi)}{f_-(\xi)}\,\Big|\,H(t)\right]$. Note that the
drift and variance are no longer constant, but depend on the state of the environment
$H(t)$ at time $t$. We use $\mathrm{E}_\xi[F(\xi)|H(t)]$ to denote the expectation of $F(\xi)$ when $\xi$ is
drawn from the distribution associated with the current state $H(t)$, i.e. $f_\pm(\xi)$ when
$H(t) = H_\pm$. In Appendix B we derive Eq. (3.3) as the continuum limit of the discrete
process $y_n$.

As a consequence of the nonlinearity of Eq. (3.3) the state variable $y(t)$ will not
drift indefinitely when $g(t)$ is fixed for some time interval $t \in [a, b]$. Rather, trajectories
will tend to accumulate about the single fixed point of the noise-free system (the
case $\rho(t) = 0$). Importantly, more volatile environments (larger $\epsilon_\pm$) correspond to
fixed points that are closer to the midline $y = 0$, allowing for more rapid changes
in sign$[y(t)]$. The observer’s belief about the environmental state is encoded by the
log-likelihood ratio, and changes at a rate related to the frequency of environmental 
changes. 
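The fixed point of the noise-free system can be written in closed form: setting the drift of Eq. (3.3) to zero and substituting $u = e^{y}$ gives the quadratic $\epsilon_+ u^2 - (g + \epsilon_- - \epsilon_+)u - \epsilon_- = 0$, whose positive root determines $y$. A small sketch, assuming a constant drift $g$ and strictly positive switching rates (the function names are hypothetical):

```python
import math

def drift(y, g, eps_plus, eps_minus):
    """Deterministic part of Eq. (3.3) for a fixed evidence rate g."""
    return g + eps_minus * (math.exp(-y) + 1.0) - eps_plus * (math.exp(y) + 1.0)

def fixed_point(g, eps_plus, eps_minus):
    """Zero of the drift, obtained from the positive root u = e^y of the quadratic.

    Assumes eps_plus > 0 and eps_minus > 0.
    """
    c = g + eps_minus - eps_plus
    u = (c + math.sqrt(c * c + 4.0 * eps_plus * eps_minus)) / (2.0 * eps_plus)
    return math.log(u)

slow = fixed_point(2.0, 1.0, 1.0)   # symmetric rates: equals asinh(g / (2 * eps))
fast = fixed_point(2.0, 5.0, 5.0)   # same evidence rate, more volatile environment
```

Here `fast` lies closer to zero than `slow`, in line with the statement that more volatile environments pull the fixed point toward the midline.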

The nonlinear term in Eq. (3.3) does not appear in Eq. (2.3). It serves to discount
older evidence by a factor determined by environmental volatility, i.e. the frequencies
of changes in environmental states, $\epsilon_\pm$. In previous work such discounting was modeled
heuristically by a linear term [41-43]; however, our derivation shows that the resulting
Ornstein-Uhlenbeck (OU) process is only an approximation of an optimal observer’s
evidence accumulation process.


3.1. Equal switching rates between two states. When $\epsilon := \epsilon_+ = \epsilon_-$, the
frequencies of switches between states are equal. Eq. (3.3) then becomes

$$dy = g(t)\,dt - 2\epsilon\sinh(y)\,dt + \rho(t)\,dW_t. \tag{3.4}$$

The steepness of the function $\sinh(y)$ at large values of $y$ ensures that evidence is
discounted more rapidly for large log-likelihood ratios than for small ones (Fig. 3.4F,
below). As a result, evidence builds up faster when $y$ is closer to zero, i.e. when the
observer is more uncertain. If we rescale time using $\tau = \epsilon t$, the rate of switches
between environmental states is unity. We obtain an equation for $y_\tau := y(\tau/\epsilon)$:

$$dy_\tau = g(\tau)\,d\tau - 2\sinh(y_\tau)\,d\tau + \rho(\tau)\,dW_\tau, \tag{3.5}$$

where the rescaled drift and noise amplitude are $g(\tau) := g(t)/\epsilon$ and $\rho(\tau) := \rho(t)/\sqrt{\epsilon}$,
evaluated at $t = \tau/\epsilon$. Recall that $g(t)$ is the rate
of evidence accumulation in the present state, and $\epsilon^{-1}$ is the average time spent in
each state. Hence, $g(\tau) = g(t)/\epsilon$ can be interpreted as the information gained over
an average duration of the present environmental state.

When observations follow Gaussian distributions, $f_\pm \sim \mathcal{N}(\pm\mu, \sigma^2)$, then $g(t) = \pm 2\mu^2/\sigma^2$,
$\rho = 2\mu/\sigma$, and

$$dy_\tau = \mathrm{sign}[g(\tau)]\,m\,d\tau - 2\sinh(y_\tau)\,d\tau + \sqrt{2m}\,dW_\tau, \tag{3.6}$$

where $m = 2\mu^2/(\sigma^2\epsilon)$. Thus, the behavior of this system is completely determined by
the single parameter $m$, the information gain over an average environmental epoch.
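The Gaussian constants can be checked with a short calculation. The sketch below assumes that a single observation over a step of length $\Delta t$ is distributed as $\mathcal{N}(\pm\mu\Delta t, \sigma^2\Delta t)$; this scaling is an assumption made here so that the limit $\Delta t \to 0$ is well defined, and it may differ from the convention used in the Appendix.

```latex
\begin{align*}
\ln\frac{f_+(\xi)}{f_-(\xi)}
  &= \frac{(\xi + \mu\Delta t)^2 - (\xi - \mu\Delta t)^2}{2\sigma^2\Delta t}
   = \frac{2\mu}{\sigma^2}\,\xi, \\
g(t) &= \frac{1}{\Delta t}\,\frac{2\mu}{\sigma^2}\,\mathrm{E}_\xi[\xi \mid H_\pm]
   = \frac{1}{\Delta t}\,\frac{2\mu}{\sigma^2}\,(\pm\mu\Delta t)
   = \pm\frac{2\mu^2}{\sigma^2}, \\
\rho^2 &= \frac{1}{\Delta t}\,\frac{4\mu^2}{\sigma^4}\,\mathrm{Var}_\xi[\xi \mid H_\pm]
   = \frac{1}{\Delta t}\,\frac{4\mu^2}{\sigma^4}\,\sigma^2\Delta t
   = \frac{4\mu^2}{\sigma^2}, \\
\frac{|g(t)|}{\epsilon} &= \frac{2\mu^2}{\sigma^2\epsilon} = m,
\qquad
\frac{\rho^2}{\epsilon} = \frac{4\mu^2}{\sigma^2\epsilon} = 2m .
\end{align*}
```

The last line shows why a single parameter suffices: after rescaling time by $\epsilon$, both the drift magnitude $m$ and the noise amplitude $\sqrt{2m}$ are fixed by the same ratio.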

We now analyze the results of two decision-making processes that utilize the log-likelihood
ratio. Under the interrogation protocol, the observer waits until a given time
$\tau = T$ and reports sign$[y_\tau(T)] = \pm 1$. Under the free response protocol, we assume
that the observer uses a predetermined threshold, $\theta$, waits until time $\tau^*$ at which the
decision variable meets this threshold, $|y_\tau(\tau^*)| = \theta$, and then reports sign$[y_\tau(\tau^*)]$. For
Fig. 3.3. Dependence of the probability of the correct response (accuracy) on normalized
information gain, $m$, in a symmetric environment. (A) Accuracy in an interrogation protocol
increases with $m$ and interrogation time, $\tau$, but saturates. Horizontal bars on left indicate the
accuracy when the environment is in a single state for a long time, as in Eq. (3.7). (B) When
the observer responds freely accuracy is similar, but saturates at 1. The increase in accuracy with
waiting time is exceedingly slow for low $m$. We fix $\epsilon_{ij} = \epsilon$ for all $i \neq j$, $g_i = g$, and set $m = g/\epsilon = 20$.
(C) Accuracy in an interrogation protocol decreases with the number of alternatives $N$ (See Section
4 for $N > 2$), saturating at ever lower levels. (D) The free response protocol results in similar
behavior, but the accuracy saturates at 1. The increase in accuracy with waiting time is exceedingly
slow for higher numbers of alternatives $N$.


Eq. (3.6), the probabilities of a correct response (accuracy) under both interrogation 
(Fig. 3.3A) and free response (Fig. 3.3B) protocols increase with m. When an 
optimal observer is interrogated about the state of the environment at time T, the 
answer is determined by the sign of the log-likelihood ratio, $y_\tau$. Since observers
discount old evidence at a rate increasing with $1/m$, decisions are effectively based
on a fixed amount of evidence, and accuracy saturates at a value smaller than 1 (Fig.
3.3A). On the other hand, accuracy arbitrarily close to 1 can be obtained in the free
response protocol by increasing the threshold $\theta$ (Fig. 3.3B). Equations for the case of
multiple alternatives ($N > 2$) are provided in Section 4, and increasing $N$ decreases
accuracy for a fixed decision time (Fig. 3.3C,D). 
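The interrogation protocol for Eq. (3.6) can be explored with a simple Monte Carlo experiment. The sketch below uses an Euler-Maruyama discretization with a telegraph process of unit switching rate; the step size, trial count, flat initial belief, and the function name `interrogation_accuracy` are all assumptions chosen for illustration, and the estimate carries sampling error.

```python
import numpy as np

def interrogation_accuracy(m, T=1.0, dt=1e-3, n_trials=2000, seed=0):
    """Estimate the probability that sign(y) matches the state at time T."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / dt))
    state = rng.choice([-1.0, 1.0], size=n_trials)  # initial state, +1 or -1
    y = np.zeros(n_trials)                          # flat prior: y(0) = 0
    for _ in range(n_steps):
        noise = rng.standard_normal(n_trials)
        # Euler-Maruyama step of Eq. (3.6): drift sign follows the current state.
        y += (state * m - 2.0 * np.sinh(y)) * dt + np.sqrt(2.0 * m * dt) * noise
        # Telegraph process: each trial switches with probability dt per step.
        flip = rng.random(n_trials) < dt
        state = np.where(flip, -state, state)
    return float(np.mean(np.sign(y) == state))

acc_low = interrogation_accuracy(0.5)    # little information per epoch
acc_high = interrogation_accuracy(20.0)  # much information per epoch
```

The explicit scheme is adequate here because trajectories stay near the fixed points of the drift; a much larger step size would call for an implicit or tamed scheme to handle the steep $\sinh$ term.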

If the environment in Eq. (3.6) remains in a single state for a long time, the
log-likelihood ratio, $y_\tau$, approaches a stationary distribution,

$$S_\pm(y_\tau) = \mathcal{N}\exp\left[\pm y_\tau - \frac{2\cosh(y_\tau)}{m}\right], \quad H(\tau) = H_\pm, \tag{3.7}$$

where $H(\tau)$ denotes the state $H(t)$ at $t = \tau/\epsilon$ and $\mathcal{N}$ is a normalization constant. Details on finding the
stationary density of the Fokker-Planck equation associated with a nonlinear SDE
such as Eq. (3.6) can be found in Ch. 5 of [21]. The distribution, Eq. (3.7), is concentrated
around $\bar{y}_\tau^\pm = \pm\sinh^{-1}\frac{m}{2}$, the fixed points of the deterministic counterpart of
Eq. (3.6) obtained by setting $W_\tau = 0$. Since old evidence is continuously discounted,
the belief of an optimal observer tends to saturate. In contrast, no stationary distribution
exists when $\epsilon = 0$, and the environment is static: Aggregating new evidence
then always tends to increase an optimal observer’s belief in one of the choices.

Since $S_\pm(y)$ is obtained by assuming that the environment is trapped in a single
state for an extended time, $\int_0^\infty S_+(y)\,dy = \int_{-\infty}^0 S_-(y)\,dy$ provides an upper bound on
the accuracy (Fig. 3.3A). To achieve accuracy $a$ in the free response protocol (Fig.
3.3B), we require $|y_\tau| \geq \ln\frac{a}{1-a}$ [8]. While the threshold $\theta = \ln\frac{a}{1-a}$ that $y_\tau(\tau)$ must
cross to obtain a specific accuracy, $a$, does not change with $m$, the time to reach this
threshold increases steeply with $a$ and decreases with $m$. This is partly due to the
fact that for smaller $m$, environmental switches are rapid, causing frequent changes
in the drift of $y_\tau(\tau)$ and keeping it close to the midline $y_\tau = 0$.
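Both quantities in this paragraph are straightforward to evaluate numerically. The sketch below assumes the stationary density for state $H_+$ has the form $S_+(y) \propto \exp[y - 2\cosh(y)/m]$, which follows from the Fokker-Planck equation of Eq. (3.6) with the drift sign held fixed; the grid parameters and function names are hypothetical, and the default grid is only suitable for moderate $m$.

```python
import math
import numpy as np

def saturation_accuracy(m, y_max=30.0, n=200001):
    """Mass of S_+(y), proportional to exp(y - 2*cosh(y)/m), on the half-line y > 0."""
    y = np.linspace(0.0, y_max, n)
    dy = y[1] - y[0]
    w = np.exp(-2.0 * np.cosh(y) / m)   # even factor shared by S_+(y) and S_+(-y)
    right = w * np.exp(y)               # unnormalized S_+ at +y (correct side)
    left = w * np.exp(-y)               # unnormalized S_+ at -y (wrong side)

    def trapezoid(f):
        return dy * (f.sum() - 0.5 * (f[0] + f[-1]))

    return float(trapezoid(right) / (trapezoid(right) + trapezoid(left)))

def threshold_for_accuracy(a):
    """Threshold theta = ln(a / (1 - a)) on |y| for a target accuracy a in (0.5, 1)."""
    return math.log(a / (1.0 - a))

bound_low = saturation_accuracy(1.0)
bound_high = saturation_accuracy(20.0)
theta = threshold_for_accuracy(0.95)
```

The threshold formula reflects that $y$ is a log-likelihood ratio: when $|y| = \theta$, the more probable state has posterior probability $1/(1 + e^{-\theta})$, which equals $a$ for the value returned above.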

3.2. Linear approximation of the SDE. An advantage of Eq. (3.3) is that 
it is amenable to standard methods of stochastic analysis. We can find an accurate 
piecewise linear approximation to Eq. (3.3), although, for simplicity, we focus on 
Eq. (3.6). The piecewise OU process that models an observer that linearly discounts 
evidence has the form 

$$dy_\tau = b\left(\mathrm{sign}[g(\tau)]\,m\,d\tau + \sqrt{2m}\,dW_\tau\right) + \lambda y_\tau\,d\tau. \tag{3.8}$$

For Eq. (3.8) to be the continuum limit of a linear log-likelihood update process, the
drift and diffusion need to be co-scaled by the common parameter $b$. We begin by
focusing on a linear approximation of Eq. (3.6) with the same equilibria and local
stability, obtained by setting $\lambda = -\sqrt{m^2 + 4}$ and $b = \frac{\sqrt{m^2 + 4}}{m}\sinh^{-1}\frac{m}{2}$. Individual
realizations of Eq. (3.8) and Eq. (3.6) agree in quickly changing environments (Fig.
3.4A, $m = 1$), but are less similar in slowly changing environments (Fig. 3.4B, $m = 10$;
see also panel C). Thus, as observer performance improves, the nonlinear term in
Eq. (3.6) becomes more important. Note that the corresponding drift-diffusion model,
$dy_\tau = \mathrm{sign}[g(\tau)]\,m\,d\tau + \sqrt{2m}\,dW_\tau$, is qualitatively different as it lacks a restorative
leak term. This difference becomes more pronounced as $m$ increases (Fig. 3.4C).

Eq. (3.8) can be integrated explicitly using standard methods in stochastic calculus
[21]. Furthermore, the accuracies of both systems saturate to a value smaller
than 1 in the interrogation protocol as the interrogation time increases (Fig. 3.4C).

This linearized approximation can differ considerably from the full nonlinear
model. For instance, in the interrogation protocol the performance of an ideal observer
modeled by Eq. (3.6) increases with interrogation time (Fig. 3.4D), and accuracy approaches
1 as $m$ diverges. In contrast, the accuracy of an observer that discounts
evidence linearly limits to a value below 1 as $m$ diverges. Indeed, this can be seen by
employing the quasi-steady state approximation (fixing sign$[g(\tau)] = 1$), and computing
$\int_0^\infty T(y)\,dy$, where $T(y)$ is the steady state distribution of the OU process given
in Eq. (3.8) with $\lambda = -\sqrt{m^2 + 4}$ and $b = \frac{\sqrt{m^2 + 4}}{m}\sinh^{-1}\frac{m}{2}$, to obtain

$$U_m := \int_0^\infty T(y)\,dy = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\sqrt{\frac{m}{2\sqrt{m^2 + 4}}}\right),$$

and $\lim_{m\to\infty} U_m = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\frac{1}{\sqrt{2}}\right) \approx 0.84 < 1$.
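The saturation value of the linear model can be recomputed from the stationary statistics of the OU process. With the drift sign fixed at $+1$, Eq. (3.8) has stationary mean $-bm/\lambda$ and variance $b^2 m/|\lambda|$, so the positive-side mass is a Gaussian tail probability. A minimal sketch (the function name is hypothetical):

```python
import math

def ou_saturation_accuracy(m):
    """Mass on y > 0 of the stationary Gaussian of Eq. (3.8) with drift sign +1."""
    lam = -math.sqrt(m * m + 4.0)                           # leak matching local stability
    b = math.sqrt(m * m + 4.0) / m * math.asinh(m / 2.0)    # matches the equilibrium
    mean = -b * m / lam                                     # equals asinh(m / 2)
    var = (b * b * 2.0 * m) / (2.0 * abs(lam))              # diffusion^2 / (2 * |leak|)
    return 0.5 * (1.0 + math.erf(mean / math.sqrt(2.0 * var)))

u_moderate = ou_saturation_accuracy(3.0)
u_large = ou_saturation_accuracy(1e8)   # numerically close to the m -> infinity limit
```

The scale factor $b$ cancels in the ratio of mean to standard deviation, which is why the result depends on $m$ only through $m/\sqrt{m^2 + 4}$.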

In general, there is a family of linear approximations to Eq. (3.6) given by
Eq. (3.8), where $\lambda \in (-\infty, 0]$. However, the choice of $\lambda$ depends on the way we
measure the quality of the approximation. For example, we may want to maximize
Fig. 3.4. Closest linear approximations of the nonlinear SDE, Eq. (3.6). Single realizations
of the nonlinear Eq. (3.6), linear approximation Eq. (3.8), and corresponding drift-diffusion
model $dy_\tau = \mathrm{sign}[g(\tau)]\,m\,d\tau + \sqrt{2m}\,dW_\tau$, in (A) a quickly changing environment ($m = 1$), and (B)
a slowly changing environment ($m = 10$). We used the same realizations of drift $g(\tau)$, and noise
$W_\tau$ for all models. (C) In the interrogation protocol, accuracy increases faster in the nonlinear
Eq. (3.6) than in the linear approximation Eq. (3.8). Accuracy eventually decreases in the drift
model since all evidence is weighted equally across time. (D) In the limit $\tau \to \infty$, accuracy saturates
below unity in both the nonlinear model and linear approximation. The linear model discounts
evidence sub-optimally, and hence performs worse. (E) Accuracy under the interrogation protocol
with stopping time $\tau = 1$ for the linear model, Eq. (3.8), with leak $\lambda$ (blue ticks: $\lambda_{best}$, black tick:
$\lambda = -2$). Optimal $\lambda$-values for the linear approximation (blue curves) result in accuracy that is
very close to that of the optimal nonlinear model, Eq. (3.6) (red lines). (F) Plot of $2\sinh(y_\tau)$
demonstrating two possible linear approximations: most accurate linear approximation from panel
E (blue), linearization of $2\sinh(y_\tau)$ at the origin (black).


decision accuracy under the interrogation policy with a specific stopping time, or 
maximize accuracy under the free response policy. In general, we need numerical 
optimization methods to identify the $\lambda$ that provides the best linear approximation.
Without loss of generality, we can fix $b = 1$ in Eq. (3.8), since the rescaling $z_\tau = y_\tau/b$
preserves sign$[z_\tau]$ = sign$[y_\tau]$ and eliminates $b$. Thus, we need only study the system
$dy_\tau = \mathrm{sign}[g(\tau)]\,m\,d\tau + \sqrt{2m}\,dW_\tau + \lambda y_\tau\,d\tau$. For a given $m$, there is a single value,
$\lambda = \lambda_{best}$, that maximizes the accuracy of decisions after an interrogation at $\tau = 1$
(Fig. 3.4E). However, there is a different $\lambda_{best}$ for each value of $m$. Interestingly,
the best linear approximation has accuracy close to that of the nonlinear system. We
note that the linear approximation at the origin ($\lambda = -2$, see also Fig. 3.4F) did not
perform well. Since the accuracy has saturated at $\tau = 1$ (Fig. 3.4C), the optimal
value of $\lambda$ will not change significantly for larger interrogation times.

Similarly, $\lambda_{best}$ will change with different thresholds under the free response protocol,
or with other measures of performance, such as reward rate [8]. In contrast, the
nonlinear model given by Eq. (3.6) reflects the log-likelihood ratio exactly. Therefore,
we can use this single model for any decision that can be made optimally using the 
log-likelihood ratio. 

4. Multiple alternatives in a changing environment. We next extend our 
analysis of evidence accumulation in changing environments to the case of multiple 
alternatives. With multiple environmental states, $H_i$ ($i = 1, \ldots, N$), the optimal observer
computes the present probability of each state (Fig. 4.1A) from a sequence of
measurements, $\xi_{1:n}$. Measurements have probability $f_i(\xi_n) := \Pr(\xi_n|H_i)$ dependent
on the states $H_i$ [4,9] (Fig. 4.1B). We assume that the state of the environment, $H(t)$,

Fig. 4.1. Evidence accumulation with multiple choices in a changing environment. (A) The
environment switches between $N$ states (here $N = 3$). (B) Distributions $f_i(\xi) = \Pr(\xi|H_i)$ describing
the probability of observation $\xi$ in environmental state $H_i$ (here $N = 3$). (C,D) Realization of the
log-likelihood ratios (panel C): $\ln(L_1/L_2)$, $\ln(L_2/L_3)$, $\ln(L_3/L_1)$, and (panel D): $\ln L_1$, $\ln L_2$, $\ln L_3$.


changes as a memoryless process. A change from state $j$ to $i$ between two measurements
occurs with probability $\epsilon_{ij}\Delta t = \Pr(H(t_n) = H_i\,|\,H(t_{n-1}) = H_j)$ for $i \neq j$, and
$\Pr(H(t_n) = H_i\,|\,H(t_{n-1}) = H_i) = 1 - \sum_{j \neq i}\Delta t\,\epsilon_{ji}$ (Fig. 4.1A).

We again use sequential analysis to obtain the probabilities $L_{n,i} = \Pr(H(t_n) = H_i|\xi_{1:n})$
that the environment is in state $H_i$ given observations $\xi_{1:n}$. The index that
maximizes the posterior probability, $\hat{\imath} = \mathrm{argmax}_i L_{n,i}$, corresponds to the most probable
state, given the observations $\xi_{1:n}$. Following the approach above, we obtain (See
Appendix D):

$$L_{n,i} = \frac{\Pr(\xi_{1:n-1})}{\Pr(\xi_{1:n})}\,f_i(\xi_n)\left[\Big(1 - \sum_{j \neq i}\Delta t\,\epsilon_{ji}\Big)L_{n-1,i} + \sum_{j \neq i}\Delta t\,\epsilon_{ij}L_{n-1,j}\right].$$

Again after taking logarithms, $x_{n,i} = \ln L_{n,i}$, we can approximate the discrete stochastic
process in Eq. (4), with an SDE:


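$$dx = g(t)\,dt + \Lambda(t)\,dW_t + K(x)\,dt, \tag{4.1}$$

where the drift has components $g_i(t) = \frac{1}{\Delta t}\mathrm{E}_\xi[\ln f_i(\xi)|H(t)]$, $\Lambda(t)\Lambda(t)^T = \Sigma(t)$ with
entries $\Sigma_{ij} = \frac{1}{\Delta t}\mathrm{Cov}_\xi[\ln f_i(\xi), \ln f_j(\xi)|H(t)]$, components of $W_t$ are independent
Wiener processes, and $K_i(x) = \sum_{j \neq i}\left(\epsilon_{ij}e^{x_j - x_i} - \epsilon_{ji}\right)$. The drift $g_i$ is maximized
in environmental state $H_i$ (Fig. 4.1C,D).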
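Before passing to the continuum limit, the discrete posterior update for $N$ states can be written as a single matrix operation. The sketch below assumes the convention that `eps[i, j]` holds the rate of switching from state $j$ to state $i$; the function name and the numerical values are hypothetical. Normalizing the result to sum to one plays the role of the factor $\Pr(\xi_{1:n-1})/\Pr(\xi_{1:n})$.

```python
import numpy as np

def update_posterior(L_prev, likelihoods, eps, dt):
    """One discrete update of the state probabilities for N alternatives.

    L_prev:      probabilities of each state after the previous observation.
    likelihoods: values f_i(xi_n) of the new observation under each state.
    eps:         eps[i, j] is the switching rate from state j to state i (i != j).
    """
    L_prev = np.asarray(L_prev, dtype=float)
    rates = np.array(eps, dtype=float)
    np.fill_diagonal(rates, 0.0)              # only off-diagonal rates are used
    leave = rates.sum(axis=0)                 # total rate of leaving each state
    predicted = (1.0 - dt * leave) * L_prev + dt * (rates @ L_prev)
    posterior = np.asarray(likelihoods, dtype=float) * predicted
    return posterior / posterior.sum()        # normalize to a probability vector

# Two-state check (index 0 is H_+, index 1 is H_-) with eps_+ = 1 and eps_- = 2.
eps = [[0.0, 2.0],
       [1.0, 0.0]]
post = update_posterior([0.6, 0.4], [2.0, 1.0], eps, dt=0.1)
ratio = post[0] / post[1]
```

For two states the ratio `post[0] / post[1]` reproduces the update in Eq. (3.2), which gives a quick consistency check on the index convention.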

We can recover the case of two alternatives by setting $N = 2$ and exchanging the
numbers in Eq. (4.1) with $\pm$ to obtain the approximating SDEs:

$$dx_\pm = \left[g_\pm(t) + \left(\epsilon_\mp e^{x_\mp - x_\pm} - \epsilon_\pm\right)\right]dt + dW_\pm, \tag{4.2}$$

where $\langle W_i W_j\rangle = \Sigma_{ij}(t)\cdot t$ for $i, j \in \{+, -\}$. We obtain Eq. (3.3) by setting $y :=
x_+ - x_-$. Note that since $x_\pm = \ln(L_\pm)$ are the log-likelihoods, $y := x_+ - x_- =
\ln(L_+/L_-)$ is the log-likelihood ratio. Analogous expressions for the log-likelihood
ratios $\ln(L_i/L_j)$ are derived in Appendix E. The matrix of these log-likelihood
ratios quantifies how much more likely one alternative is compared to others (e.g.,
Fig. 4.1C) [18].
Fig. 4.2. Evidence accumulation with a continuum of choices. The observer infers the state
of the environment, $H_\theta$, where $\theta \in [-1, 1]$, and the state changes at discrete points in time. (A)
In slowly changing environments, the distribution of the log probabilities, $x_\theta$, can nearly equilibrate
between switches. (Solid line represents the true state of the environment at time $t$. For clarity, we
show results of simulations without noise, $W_\theta = 0$.) (B) In quickly changing environments, the distribution
does not have time to equilibrate between switches. (C) In slowly changing environments,
the most probable state of the environment, $\hat{\theta}(t) = \mathrm{argmax}_\theta\,x_\theta(t)$ (thin lines), fluctuates around the
true value (thick line). (D) In quickly changing environments, $\hat{\theta}(t)$ fluctuates more widely, as it is
in a transient state much of the time.


5. A continuum of states in a changing environment. Lastly, we consider 
the case of a continuum of possible environmental states. This provides a tractable 
model for recent experiments with observers who infer the location of a hidden,
intermittently moving target from noisy observations. Evidence suggests that humans
update their beliefs quickly and near optimally when observations indicate that the 
target has moved [29]. 

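Suppose the environmental state, $H(t)$, intermittently switches between a continuum
of possible states, $H_\theta$, where $\theta \in [a, b]$. An observer again computes the probabilities
of each state from observations, $\xi_{1:n}$, with distributions $f_\theta(\xi_n) := \Pr(\xi_n|H_\theta)$.
The environment switches from state $\theta'$ to state $\theta$ between observations with transition
probabilities $\epsilon_{\theta\theta'}\,d\theta\,\Delta t := \Pr(H(t_n) = H_\theta\,|\,H(t_{n-1}) = H_{\theta'})$ for $\theta \neq \theta'$, and
$\Pr(H(t_n) = H_\theta\,|\,H(t_{n-1}) = H_\theta) = 1 - \int_a^b \Delta t\,\epsilon_{\theta'\theta}\,d\theta'$ (See Appendix F for details).
From Eq. (4) the expression for the probabilities $L_{n,\theta} = \Pr(H(t_n) = H_\theta|\xi_{1:n})$ is
derived in Appendix F, yielding:

$$L_{n,\theta} = \frac{\Pr(\xi_{1:n-1})}{\Pr(\xi_{1:n})}\,f_\theta(\xi_n)\left[\Big(1 - \int_a^b \Delta t\,\epsilon_{\theta'\theta}\,d\theta'\Big)L_{n-1,\theta} + \int_a^b \Delta t\,\epsilon_{\theta\theta'}L_{n-1,\theta'}\,d\theta'\right].$$

We again approximate the logarithms of the probabilities, $\ln L_{n,\theta}$, by a temporally
continuous process,

$$dx_\theta(t) = g_\theta(t)\,dt + dW_\theta(t) + K_\theta(x(t))\,dt, \tag{5.1}$$

where $x = (x_\theta)_{\theta \in [a,b]}$, $g_\theta(t) = \frac{1}{\Delta t}\mathrm{E}_\xi[\ln f_\theta(\xi)|H(t)]$, $W_\theta$ is a spatiotemporal noise term
with mean zero and covariance function given by

$$\Sigma_{\theta\theta'}(t) = \frac{1}{\Delta t}\mathrm{Cov}_\xi[\ln f_\theta(\xi), \ln f_{\theta'}(\xi)|H(t)],$$

and $K_\theta(x) = \int_a^b\left(\epsilon_{\theta\theta'}e^{x_{\theta'} - x_\theta} - \epsilon_{\theta'\theta}\right)d\theta'$ is an interaction term describing the discounting
process.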

The drift $g_\theta(t)$ is maximal when $\theta$ agrees with the present environmental state.
The most likely state, given observations up to time $t$, is $\hat{\theta}(t) = \mathrm{argmax}_\theta\,x_\theta(t)$.

In slowly changing environments, the log probability $x_\theta(t)$ nearly equilibrates to
a distribution with a well-defined peak between environmental switches (Fig. 4.2A).
This does not occur in quickly changing environments (Fig. 4.2B). However, each
logarithm, $x_\theta(t)$, approaches a stationary distribution if the environmental state remains
fixed for a long time. The term $K_\theta(x)$ in Eq. (5.1) causes rapid departure from
this quasi-stationary density when the environment changes, a mechanism proposed
in [29].

Even when the environment is stationary for a long time, noise in the observations 
stochastically perturbs the log probabilities, $x_\theta(t)$, over the environmental states.
This leads to fluctuations in the estimate $\hat{\theta}(t)$ of the most probable alternative (Fig.
4.2C,D). Thus, as opposed to the case of a discrete space of $N$ alternatives, the
observer’s estimate of the most probable choice will change continuously, fluctuating
about the continuum of possible alternatives. Unless changes are too rapid, the peak
of the log probability distribution, $\hat{\theta}(t)$, fluctuates around the true environmental
state, and tracks abrupt changes in $H(t)$. This is in line with recent observations in
human behavioral data [22,29]. 

6. A neural implementation of an optimal observer. Previous neural 
models of decision making typically relied on mutually inhibitory neural networks 
[10,30,43,46], with each population representing one alternative. These models match 
the recorded neural activity and responses of monkeys performing two-alternative 
forced-choice decision tasks, where single trial stimuli have stationary statistics [24]. 
Even when reward rates are varied across trials, animals can adjust their behavior 
near-optimally from trial-to-trial in ways that are well captured by mutually inhibitory 
models [20]. Interestingly, these networks also provide a plausible model of decision-making
in house-hunting honeybee swarms [33]. In previous studies, it has been
shown that a single fixed point can be stabilized in linear population models, as long
as the strength of mutual inhibition is weaker than the leak of individual populations
[8,10,43]. As we will show, a complementary approach in linear population
models is to consider a mutually excitatory network, with arbitrary leak in individual 
populations. As with the linear approximations discussed above, such models perform 
suboptimal inference in changing environments, but can approach the performance of 
the ideal nonlinear discounting process given by Eq. (3.3). 

Optimal inference in dynamic environments with two states, $H_+$ and $H_-$, can be
performed by mutually excitatory nonlinear neural populations with activities (firing
rates) $r_+$ and $r_-$ evolving according to:

$$dr_+ = \left[I_+(t) - \alpha r_+ + F_+(r_- - r_+)\right]dt + dW_+, \tag{6.1a}$$

$$dr_- = \left[I_-(t) - \alpha r_- + F_-(r_+ - r_-)\right]dt + dW_-, \tag{6.1b}$$

where the transfer functions are $F_\pm(x) = -\alpha x/2 + \epsilon_\mp e^{x} - \epsilon_\pm$, the mean input $I_\pm(t) = g_\pm$
when $H(t) = H_\pm$ and vanishes otherwise, $W_\pm$ are Wiener processes representing
Fig. 6.1. Neural population models of evidence accumulation. (A) Two populations $u_\pm$ receive
a fluctuating stimulus with mean $I_\pm$; they are mutually coupled by excitation (circles) and locally
affected by inhibition/leak (flat ends). When $I_+ > 0$, the fixed point of the system has coordinates
satisfying $x_+ > x_-$, as shown in the plots of the associated potentials. (B) Taking $\epsilon_\pm \to 0$ in
Eq. (6.1) generates a mutually inhibitory network that perfectly integrates inputs $I_\pm$ and has a flat
potential function. (C) With $N = 3$ alternatives, three populations coupled by mutual excitation can
still optimally integrate the inputs, rapidly switching between the fixed points of the system in
response to environmental changes.


the variability in the input signal with covariance defined as in Eq. (4.2) (See Appendix 
D). Thus, I±{t)dt + dW± represents the total input to population r±. When a > 0 
and sufficiently small, population activities are modulated by self-inhibition/leak, and 
mutual excitation (Fig. 6.1A). The parameter a determines the leak in each individual 
population, which depends on both the time constants and recurrent architecture of 
the local network [46] . The difference y = r+ — r_ evolves according to the SDE for 
the log-likelihood ratio, Eq. (3.3). In the limit of a stationary environment, e± —>■ 0, 
we obtain a linear integrator dr± = [I±dt -|- dW±\ — a{r^ + r_)dt/2, as in previous 
studies [8,30]. 
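The claim that the difference $y = r_+ - r_-$ obeys the log-likelihood-ratio SDE can be checked numerically. The following is a minimal sketch, not the authors' implementation: it integrates Eq. (6.1) with an Euler-Maruyama scheme and compares the drift of $r_+ - r_-$ with the drift $g + \epsilon_-(e^{-y}+1) - \epsilon_+(1+e^{y})$. The function names, the parameter values, the input amplitude `I0`, and the choice of independent noise in the two populations (scaled so that the difference has standard deviation `rho`) are all assumptions made for illustration.

```python
import numpy as np

def F_plus(x, alpha, eps_p, eps_m):
    # F_+(x) = -alpha*x/2 + eps_- * e^x - eps_+
    return -alpha * x / 2 + eps_m * np.exp(x) - eps_p

def F_minus(x, alpha, eps_p, eps_m):
    # F_-(x) = -alpha*x/2 + eps_+ * e^x - eps_-
    return -alpha * x / 2 + eps_p * np.exp(x) - eps_m

def drift(r_p, r_m, I_p, I_m, alpha, eps_p, eps_m):
    """Deterministic part of Eq. (6.1a) and (6.1b)."""
    d_p = I_p - alpha * r_p + F_plus(r_m - r_p, alpha, eps_p, eps_m)
    d_m = I_m - alpha * r_m + F_minus(r_p - r_m, alpha, eps_p, eps_m)
    return d_p, d_m

def llr_drift(y, g, eps_p, eps_m):
    """Drift of the log-likelihood ratio SDE."""
    return g + eps_m * (np.exp(-y) + 1) - eps_p * (1 + np.exp(y))

def simulate(T=10.0, dt=1e-3, alpha=0.1, eps_p=1.0, eps_m=1.0,
             I0=5.0, rho=1.0, seed=0):
    """Euler-Maruyama integration of Eq. (6.1) with a switching environment."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    r = np.zeros((n + 1, 2))
    state = 1  # +1 means H(t) = H_+, -1 means H(t) = H_-
    for k in range(n):
        # eps_+ (eps_-) is the rate of leaving H_+ (H_-)
        rate = eps_p if state == 1 else eps_m
        if rng.random() < rate * dt:
            state = -state
        I_p, I_m = (I0, 0.0) if state == 1 else (0.0, I0)
        d_p, d_m = drift(r[k, 0], r[k, 1], I_p, I_m, alpha, eps_p, eps_m)
        # Assumption: independent noise in each population
        noise = (rho / np.sqrt(2)) * np.sqrt(dt) * rng.standard_normal(2)
        r[k + 1] = r[k] + dt * np.array([d_p, d_m]) + noise
    return r
```

Because the leak terms $-\alpha r_\pm$ and the linear parts of $F_\pm$ cancel in the difference, `drift(...)` evaluated at any $(r_+, r_-)$ should satisfy `d_p - d_m == llr_drift(r_p - r_m, I_p - I_m, ...)` up to rounding, independently of `alpha`.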

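To show that the populations mutually excite each other, we set $W_+ = W_- = 0$,
and study the dynamics in the vicinity of the fixed points of Eqs. (6.1). When the
environment has not changed for a long time, Eq. (6.1) approaches a fixed point
$(\bar{r}_+, \bar{r}_-)$ with

$$(\bar{r}_+, \bar{r}_-) = \left( \frac{I_+ + \epsilon_- e^{-y_+} - \epsilon_+}{\alpha} + \frac{y_+}{2}, \; \frac{\epsilon_+ e^{y_+} - \epsilon_-}{\alpha} - \frac{y_+}{2} \right)$$

when $I_+(t) = I_+$ and $I_-(t) = 0$, and

$$(\bar{r}_+, \bar{r}_-) = \left( \frac{\epsilon_- e^{-y_-} - \epsilon_+}{\alpha} + \frac{y_-}{2}, \; \frac{I_- + \epsilon_+ e^{y_-} - \epsilon_-}{\alpha} - \frac{y_-}{2} \right)$$

when $I_+(t) = 0$ and $I_-(t) = I_-$, where $I_\pm$ denotes the nonzero value of the mean input $I_\pm(t)$ and

$$y_\pm = \ln \left[ \frac{\pm I_\pm + \epsilon_- - \epsilon_+}{2\epsilon_+} + \sqrt{ \frac{(\pm I_\pm + \epsilon_- - \epsilon_+)^2}{4\epsilon_+^2} + \frac{\epsilon_-}{\epsilon_+} } \right].$$
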
Note that by increasing (decreasing) $\alpha$, the fixed points $(\bar{r}_+, \bar{r}_-)$ move closer to (farther
from) the origin $(0, 0)$. To determine the sign of the coupling near these fixed points,
note that the Jacobian matrix of $(F_+, F_-)$ has the form:

$$J(r_+, r_-) = \begin{pmatrix} \alpha/2 - \epsilon_- e^{r_- - r_+} & -\alpha/2 + \epsilon_- e^{r_- - r_+} \\ -\alpha/2 + \epsilon_+ e^{r_+ - r_-} & \alpha/2 - \epsilon_+ e^{r_+ - r_-} \end{pmatrix}. \tag{6.2}$$

For $\epsilon_\pm > 0$, taking $\alpha < 2\min\{\epsilon_- e^{-y_+}, \epsilon_+ e^{y_-}\}$ will guarantee that the sign of the Jacobian
matrix is

$$\begin{pmatrix} - & + \\ + & - \end{pmatrix}$$

in a region that contains the fixed point. This corresponds
to a neural network with self-inhibition/leak and mutual excitation illustrated in Fig.
6.1A.
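The sign structure of the coupling can be verified for specific parameters. The sketch below is an illustration under stated assumptions: it solves $c + \epsilon_-(e^{-y}+1) - \epsilon_+(1+e^{y}) = 0$ for the stationary difference $y$ given a constant net input $c = I_+ - I_-$, then evaluates the Jacobian of $(F_+, F_-)$ there. The helper names (`fixed_point_y`, `jacobian_F`, `alpha_bound`) and the bound on $\alpha$ (the value below which both off-diagonal entries are positive at both fixed points) are choices made for this example.

```python
import numpy as np

def fixed_point_y(c, eps_p, eps_m):
    """Positive root u = e^y of eps_+ u^2 - (c + eps_- - eps_+) u - eps_- = 0."""
    b = c + eps_m - eps_p
    u = b / (2 * eps_p) + np.sqrt(b**2 / (4 * eps_p**2) + eps_m / eps_p)
    return np.log(u)

def jacobian_F(y, alpha, eps_p, eps_m):
    """Jacobian of (F_+, F_-) with respect to (r_+, r_-), where y = r_+ - r_-."""
    a = eps_m * np.exp(-y)  # eps_- e^{r_- - r_+}
    b = eps_p * np.exp(y)   # eps_+ e^{r_+ - r_-}
    return np.array([[alpha / 2 - a, -alpha / 2 + a],
                     [-alpha / 2 + b, alpha / 2 - b]])

def alpha_bound(y_plus, y_minus, eps_p, eps_m):
    """Leak below which the off-diagonal entries are positive at both fixed points."""
    return 2 * min(eps_m * np.exp(-y_plus), eps_p * np.exp(y_minus))
```

For example, with $\epsilon_\pm = 1$ and net input $c = \pm 5$, choosing `alpha` slightly below `alpha_bound(...)` gives off-diagonal entries that are positive (mutual excitation) and diagonal entries that are negative at both fixed points.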


We can compare our results to previous studies of linear connectionist models [8,
10,43] by deriving a linear rate model that best accumulates evidence in changing
environments. To do so, we focus on the best linear approximation of the log-likelihood
ratio, given by Eq. (3.8). We have shown that when the coefficients of the linear
models are tuned appropriately, their accuracy is remarkably close to that of the full
nonlinear model (Fig. 3.4E). Assuming symmetric switching ($\epsilon_\pm = \epsilon$), the following
system describes a linear rate model that can be mapped to the linear Eq. (3.8):

$$dr_+ = \left[ I_+(t) - \kappa r_+ + \gamma r_- \right] dt + dW_+, \tag{6.3a}$$

$$dr_- = \left[ I_-(t) - \kappa r_- + \gamma r_+ \right] dt + dW_-. \tag{6.3b}$$

Here $\kappa > 0$ denotes the leak in each population's activity, and $\gamma > 0$ is the strength
of mutual excitation between populations. Selecting $I_\pm(t) = I_0$ when $H(t) = H_\pm$
and zero otherwise, it can be shown that the system will tend to the quasi-equilibria
$(\bar{r}_+, \bar{r}_-) = \frac{I_0}{\kappa^2 - \gamma^2} \cdot (\kappa, \gamma)$ and $\frac{I_0}{\kappa^2 - \gamma^2} \cdot (\gamma, \kappa)$ in either case. Stability of these fixed
points is given by the nonzero eigenvalue $\lambda = -(\kappa + \gamma) < 0$, so these quasi-equilibria
are always attractive. Note also that the reduced SDE for the difference $y = r_+ - r_-$
will take the form $dy = [I_d(t) - (\kappa + \gamma)y]dt + dW_d$, where $I_d(t) = I_+(t) - I_-(t)$
and $W_d = W_+ - W_-$, which matches the form of Eq. (3.8). Thus, in addition to
the large leak in mutually inhibitory networks [8,10,43], linear population networks
with mutual excitation possess a stable fixed point for arbitrary leak $\kappa$ and mutual
excitation $\gamma$. Any particular decision task has an optimal $\lambda$ in Eq. (3.8). Thus, a
linear neural network could be trained to learn this best evidence discounting rate if
supplemented with a plasticity rule that properly tunes the excitation strength $\gamma$.
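The quasi-equilibria and the discounting rate of the difference $y = r_+ - r_-$ in the linear model can be computed directly from the coupling matrix. The sketch below is illustrative only: the function names and numerical values are assumptions, and it assumes $\kappa \neq \gamma$ so that the quasi-equilibrium formula is defined.

```python
import numpy as np

def coupling_matrix(kappa, gamma):
    """Linear part of Eq. (6.3): leak on the diagonal, excitation off the diagonal."""
    return np.array([[-kappa, gamma], [gamma, -kappa]])

def quasi_equilibrium(I_plus, I_minus, kappa, gamma):
    """Solve A r + I = 0 for constant inputs (noise-free dynamics)."""
    A = coupling_matrix(kappa, gamma)
    return np.linalg.solve(A, -np.array([I_plus, I_minus]))

def difference_decay_rate(kappa, gamma):
    """Rate at which y = r_+ - r_- is discounted: A (1,-1)^T = -(kappa+gamma) (1,-1)^T."""
    A = coupling_matrix(kappa, gamma)
    v = np.array([1.0, -1.0])
    return -(A @ v)[0]
```

With $I_0 = 3$, $\kappa = 2$, and $\gamma = 0.5$, `quasi_equilibrium(3.0, 0.0, 2.0, 0.5)` agrees with $\frac{I_0}{\kappa^2-\gamma^2}(\kappa, \gamma)$, and `difference_decay_rate(2.0, 0.5)` returns $\kappa + \gamma$, the discounting rate that a plasticity rule acting on $\gamma$ would tune.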

Returning to the nonlinear model given by Eq. (6.1), the dynamics is matched to
the timescale of the environment determined by $\epsilon_\pm$, and solutions approach stationary
distributions if input is constant. The network's dynamics is very sensitive to changes
in inputs, a feature absent in population models with winner-take-all dynamics [48].
Even when $\epsilon$ is small, Eq. (6.1) has a single attracting state determined by the mean
inputs $I_\pm$. We illustrate the response of the model to inputs using potentials (Fig.
6.1A). In contrast to the single attractor of Eq. (6.1), mutually inhibitory models can
possess a neutrally stable line attractor that integrates inputs ($\epsilon_\pm = 0$, Fig. 6.1B) [27].

We can extend our results to $N > 2$. In [4], the reliability of motion information
was assumed to vary during a trial, and the optimal model encoded the posterior
probability distribution over the possible stimulus space. Here, we assume the true
hypothesis, $H(t)$, changes in time. For an arbitrary but finite number of possible
alternatives, $\{H_1, \ldots, H_N\}$, decisions can be performed optimally by neural populations
$(r_1, \ldots, r_N)$ coupled by mutual excitation

$$dr_i = \left[ I_i(t) - \alpha r_i + \sum_{j \neq i} F_{ij}(r_j - r_i) \right] dt + dW_i(t), \tag{6.4}$$

where the mean input $I_i(t)$ is nonzero when $H(t) = H_i$ and 0 otherwise, and the noise
vector $(dW_i(t))_{i=1}^{N} = A(t)dW_t$ describes input noise with $A(t)$ defined as
in Eq. (4.1). Population firing rates are again determined by inhibition/leak within
each population and excitation between populations, as described by the arguments
of the firing rate function

$$F_{ij}(x) = -\alpha x / N + \epsilon_{ij} e^{x} - \epsilon_{ji}.$$

In this case coupling between populations is again excitatory (Fig. 6.1C).

Note that, as in the case of $N = 2$ alternatives, taking the limit of Eq. (6.4) as
$\epsilon_{ij} \to 0$, we obtain linear integrators [30]

$$dr_i = \left[ I_i(t) - \alpha \sum_{j=1}^{N} r_j / N \right] dt + dW_i(t).$$

The nonlinear population rate models described by Eq. (6.1) and Eq. (6.4) react 
rapidly, but not instantaneously, to changes in their inputs. Recent evidence suggests 
that in monkeys the activity of single neurons in area LIP exhibits jumps, rather than 
a gradual increase as previously suggested [26]. Furthermore, the performance of rats 
and humans discriminating the direction of auditory click sequences can be optimally 
fit by a pulse-accumulating mechanism with zero noise [11]. However, the activity of 
a population of cells encoding behavior may still ramp upwards or downwards. 

7. Discussion. We have derived a nonlinear stochastic model of optimal evidence
accumulation in changing environments. Importantly, the resulting SDE is
not an OU process, as suggested by previous heuristic models [35,41,43]. Rather,
an exponential nonlinearity allows for optimal discounting of old evidence, and rapid
adjustment of decision variables following environmental changes. As a result, the
certainty of an optimal observer tends to saturate, even if the environment happens
to be stuck in a single state for long periods of time.

We have made several assumptions about the model to simplify the derivations.
Our ideal observer is assumed to be aware both of the uncertainty of their own
measurements and of the frequency with which the environment changes. A more
realistic model would require that a naive observer learn the underlying volatility of
the environment. Modeling the case of initially unknown transition rates leads to
hierarchical models that identify the location of change-points [47]. However, this
approach quickly grows in computational complexity, since the probability of change
points is determined by accounting for all possible transition histories [1]. We also
assumed that changes in the environment follow a memoryless process. In more general
cases, we would not be able to obtain a recursive equation for the probability
of a state. An ideal observer would have to use all previous observations at each
step, rather than integrating the present observation with the posterior probability
obtained with the previous observation. This process cannot be approximated by an
SDE.

Sequential sampling in dynamic environments with two states has been studied
previously in special cases, such as adapting spiking models capable of responding to
environmental changes [16]. Likelihood update procedures have also been proposed
for multiple alternative tasks in the limit $\epsilon_{ij} \to 0$ [2,18], but their dynamics was
not analyzed. A related case of a temporally changing context has also been examined
[40]. One important conclusion of our work is that $m = g/\epsilon$, the information gain
over the characteristic environmental timescale, is the key parameter determining the
model's dynamics and accuracy. It is easy to show that equivalent parameters govern
the dynamics of likelihoods of multiple choices. This allows for a straightforward
approximation of the nonlinear model by a linear SDE, which can be analyzed fully.

Models of evidence accumulation are of interest in disciplines ranging from
neuroscience and robotics to psychology and economics. They can help us understand
how decisions are made in cells, animals, ecological groups, and social networks. We
presented a principled derivation of a series of nonlinear stochastic models amenable
to stochastic analysis, and have used quasi-static approximations, first passage
techniques, and dimensional analysis to examine their dynamics. Thus we have built a
bridge between classic models in signal detection theory and nonlinear stochastic
processes. Continuous stochastic models have been very useful in interpreting human
decision making in static environments [8,24]. Dynamic environments offer a promising
future direction for theory and experiments to probe the biophysical mechanisms
that underlie decisions.

APPENDIX

In this appendix, we present the derivations for the probability update formulas and
their approximations discussed in the main text. We begin by deriving the update
expression for the probability ratio, $R_n$, in the case of two alternatives in a changing
environment. The result is a nonlinear recursive equation. Subsequently, we show
how to approximate the log-likelihood ratio, $y_n = \ln R_n$, using an SDE. To make the
approximation precise, it is key to view the discrete equation for $y_n$ as a family of
equations parameterized by the time interval, $\Delta t$, over which each observation, $\xi_n$, is
made [8]. Furthermore, we extend our derivations to multiple ($N > 2$) alternatives,
and show that the log probability updates can be approximated by a nonlinear system
of SDEs in the continuum limit. With the appropriate scaling of the probabilities,
$f_i(\xi) = \Pr(\xi | H_i)$, we can make precise the correspondence between the discrete and
continuum models of posterior probability evolution. Lastly, we present a derivation
for the stochastic integro-differential equation that represents the log probability for
a continuum of possible environmental states, $\theta \in [a, b]$.

Note that throughout the appendix, we use notation involving a subscript $\Delta t$.
This helps us define a family of stochastic processes indexed by the spacing between
observations $\Delta t = t_n - t_{n-1}$. For instance, $f_{\Delta t,\pm}(\xi)$ represents the probability of
an observation, $\xi$, in environmental state $H_\pm$ (or, in the language of statistics, when
hypothesis $H_\pm$ holds). This probability changes with the timestep $\Delta t$. This approach
allows us to properly take the continuum limit $\Delta t \to 0$. However, for simplicity we
refrain from using this notation in the main text. Rather, we treat the limiting SDEs
as approximations of discrete update processes. Also, we slightly abuse notation and
write $f_i(\xi) = \Pr(\xi | H_i)$ even when $\xi$ is a continuous random variable.

Appendix A. Likelihood ratio for two alternatives. We begin by deriving
the recursive update equation for the probabilities $L_{n,\pm} := \Pr(H(t_n) = H_\pm | \xi_{1:n})$
associated with each alternative $H_\pm$, where each observation (measurement), $\xi_n$, is
made at time $t_n$. This is the probability that alternative $H_\pm$ is true at time $t_n$, given
that the series of observations $\xi_{1:n}$ has been made. Importantly, the underlying truth
changes stochastically, and in a memoryless way, with transition probabilities given
by $\epsilon_{\Delta t,\pm} := \Pr(H(t_n) = H_\mp | H(t_{n-1}) = H_\pm)$, so that $\Pr(H(t_n) = H_\pm | H(t_{n-1}) = H_\pm) = 1 - \epsilon_{\Delta t,\pm}$.
We begin by examining the probability $L_{n,+}$ associated with the
alternative $H_+$. Using Bayes' rule and the law of total probability (Ch. 3 in [37]) we
can relate the current probability, $L_{n,+}$, to the conditional probabilities at the time
of the previous observation, $t_{n-1}$:

$$L_{n,+} = \frac{1}{\Pr(\xi_{1:n})} \sum_{s=\pm} \Pr(\xi_{1:n} | H(t_n) = H_+, H(t_{n-1}) = H_s) \Pr(H(t_n) = H_+, H(t_{n-1}) = H_s),$$

marginalizing over the joint distribution for the current $H(t_n)$ and previous $H(t_{n-1})$
environmental states. Next, we can apply the definition of the conditional probability
$\Pr(H(t_n) = H_+ | H(t_{n-1}) = H_s)$ to write

$$L_{n,+} = \frac{1}{\Pr(\xi_{1:n})} \sum_{s=\pm} \Pr(\xi_{1:n} | H(t_n) = H_+, H(t_{n-1}) = H_s) \Pr(H(t_n) = H_+ | H(t_{n-1}) = H_s) \Pr(H(t_{n-1}) = H_s).$$

Furthermore, we can split the joint condition on the first term by using the fact
that the probability of making an observation $\xi_n$ is independent of $\xi_{1:n-1}$ when we
condition on the present state $H(t_n) = H_+$ of the environment, so
$\Pr(\xi_{1:n} | H(t_n) = H_+, H(t_{n-1}) = H_s) = \Pr(\xi_n | H(t_n) = H_+) \Pr(\xi_{1:n-1} | H(t_{n-1}) = H_s)$ and

$$L_{n,+} = \frac{1}{\Pr(\xi_{1:n})} \sum_{s=\pm} \Pr(\xi_n | H(t_n) = H_+) \Pr(\xi_{1:n-1} | H(t_{n-1}) = H_s) \Pr(H(t_n) = H_+ | H(t_{n-1}) = H_s) \Pr(H(t_{n-1}) = H_s).$$

Lastly, we apply Bayes' rule to switch the order of $\Pr(\xi_{1:n-1} | H(t_{n-1}) = H_s)$, yielding
terms involving $L_{n-1,s} = \Pr(H(t_{n-1}) = H_s | \xi_{1:n-1})$. In addition, we use
$\Pr(H(t_n) = H_+ | H(t_{n-1}) = H_+) = 1 - \epsilon_{\Delta t,+}$ and $\Pr(H(t_n) = H_+ | H(t_{n-1}) = H_-) = \epsilon_{\Delta t,-}$ so that

$$L_{n,+} = \frac{\Pr(\xi_{1:n-1}) \Pr(\xi_n | H(t_n) = H_+)}{\Pr(\xi_{1:n})} \left( (1 - \epsilon_{\Delta t,+}) L_{n-1,+} + \epsilon_{\Delta t,-} L_{n-1,-} \right), \tag{A.1}$$

where $L_{0,+} = \Pr(H(t_0) = H_+)$.

Similarly, we can obtain an update equation for the probability $L_{n,-}$ of the
alternative $H_-$ at time $t_n$:

$$L_{n,-} = \frac{\Pr(\xi_{1:n-1}) \Pr(\xi_n | H(t_n) = H_-)}{\Pr(\xi_{1:n})} \left( \epsilon_{\Delta t,+} L_{n-1,+} + (1 - \epsilon_{\Delta t,-}) L_{n-1,-} \right), \tag{A.2}$$

where $L_{0,-} = \Pr(H(t_0) = H_-)$.

From Eqs. (A.1) and (A.2), the ratio $R_n = L_{n,+}/L_{n,-}$ is readily seen to satisfy
the recursive equation

$$R_n = \frac{f_{\Delta t,+}(\xi_n)}{f_{\Delta t,-}(\xi_n)} \cdot \frac{(1 - \epsilon_{\Delta t,+}) R_{n-1} + \epsilon_{\Delta t,-}}{\epsilon_{\Delta t,+} R_{n-1} + 1 - \epsilon_{\Delta t,-}}, \tag{A.3}$$

where $f_{\Delta t,\pm}(\xi_n) = \Pr(\xi_n | H(t_n) = H_\pm)$ is the distribution for each choice parameterized
by the timestep $\Delta t = t_n - t_{n-1}$, and $R_0 = \Pr(H(t_0) = H_+)/\Pr(H(t_0) = H_-)$.
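The recursions in Eqs. (A.1) to (A.3) translate directly into a few lines of code. The sketch below is a minimal illustration, assuming Gaussian observation densities; the function names and parameter values are hypothetical. It updates the normalized posteriors $L_{n,\pm}$ and, separately, the ratio $R_n$, so that the two can be compared. Since $L_{n,+} + L_{n,-} = 1$, the prefactor $\Pr(\xi_{1:n-1})/\Pr(\xi_{1:n})$ is implemented as a normalization.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

def update_posteriors(L_p, L_m, f_p, f_m, e_p, e_m):
    """One step of Eqs. (A.1) and (A.2).

    e_p = Pr(H_- at t_n | H_+ at t_{n-1}), e_m = Pr(H_+ at t_n | H_- at t_{n-1}).
    """
    new_p = f_p * ((1 - e_p) * L_p + e_m * L_m)
    new_m = f_m * (e_p * L_p + (1 - e_m) * L_m)
    Z = new_p + new_m  # normalization, equals Pr(xi_{1:n}) / Pr(xi_{1:n-1})
    return new_p / Z, new_m / Z

def update_ratio(R, f_p, f_m, e_p, e_m):
    """One step of Eq. (A.3) for R_n = L_{n,+} / L_{n,-}."""
    return (f_p / f_m) * ((1 - e_p) * R + e_m) / (e_p * R + 1 - e_m)
```

Iterating both updates on the same observation sequence, starting from $L_{0,\pm} = 1/2$ and $R_0 = 1$, should keep `R` equal to `L_p / L_m` up to rounding error. Note that when `e_p` and `e_m` are positive the ratio stays bounded no matter how many observations favor one alternative, which is the discrete counterpart of the saturation discussed in the main text.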

Appendix B. The continuum limit for the log-likelihood ratio of two
alternatives. In this section, we derive a continuum equation for the log-likelihood
ratio $y_n := \ln R_n$. We will proceed by first defining a family of stochastic difference
equations for $y_n$, which are parameterized by the timestep, $\Delta t = t_n - t_{n-1}$, between
pairs of observations. By choosing an appropriate parameterization, we obtain a
continuum limit that is an SDE. To begin, we divide both sides of Eq. (A.3) by $R_{n-1}$
and take logarithms to yield

$$y_n - y_{n-1} = \ln \frac{f_{\Delta t,+}(\xi_n)}{f_{\Delta t,-}(\xi_n)} + \ln \frac{1 - \epsilon_{\Delta t,+} + \epsilon_{\Delta t,-} e^{-y_{n-1}}}{1 - \epsilon_{\Delta t,-} + \epsilon_{\Delta t,+} e^{y_{n-1}}}. \tag{B.1}$$

Following [7,8], we assume that the time interval between individual observations,
$\Delta t$, is small. Denote by $\Delta y_n = y_n - y_{n-1}$ the change in the log-likelihood ratio due
to the observation at time $t_n$. By assumption, the probability that the environment
changes between two observations scales linearly with $\Delta t$ up to higher order terms, so
that $\epsilon_{\Delta t,\pm} := \Delta t \epsilon_\pm + o(\Delta t)$. Omitting higher order terms in $\Delta t$, Eq. (B.1) can then
be rewritten as

$$\Delta y_n = \ln \frac{f_{\Delta t,+}(\xi_n)}{f_{\Delta t,-}(\xi_n)} + \ln\left(1 + \Delta t(-\epsilon_+ + \epsilon_- e^{-y_{n-1}})\right) - \ln\left(1 + \Delta t(-\epsilon_- + \epsilon_+ e^{y_{n-1}})\right).$$


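Since we assumed $\Delta t \ll 1$, we can use the approximation $\ln(1 + a) \approx a$, which is
valid to linear order in $|a| \ll 1$. We also assume that the change in the log-likelihood
ratio, $\Delta y_n$, is small over the time interval $\Delta t$, so $y_{n-1}$ can be replaced by $y_n$ on the
right-hand side of the equation. We obtain

$$\begin{aligned} \Delta y_n &\approx \ln \frac{f_{\Delta t,+}(\xi_n)}{f_{\Delta t,-}(\xi_n)} + \Delta t\left(\epsilon_-(e^{-y_n} + 1) - \epsilon_+(1 + e^{y_n})\right) \\ &= E_\xi\left[ \ln \frac{f_{\Delta t,+}(\xi)}{f_{\Delta t,-}(\xi)} \,\Big|\, H(t_n) \right] + \left( \ln \frac{f_{\Delta t,+}(\xi_n)}{f_{\Delta t,-}(\xi_n)} - E_\xi\left[ \ln \frac{f_{\Delta t,+}(\xi)}{f_{\Delta t,-}(\xi)} \,\Big|\, H(t_n) \right] \right) \\ &\quad + \Delta t\left(\epsilon_-(e^{-y_n} + 1) - \epsilon_+(1 + e^{y_n})\right), \end{aligned} \tag{B.2}$$

where we have conditioned on the state of the environment, $H(t_n) = H_\pm$, at time $t_n$.
Replacing the index $n$ with the time $t$, we can therefore write

$$\Delta y_t \approx \Delta t\, g_{\Delta t}(t) + \sqrt{\Delta t}\, \rho_{\Delta t}(t)\, \eta + \Delta t\left(\epsilon_-(e^{-y_t} + 1) - \epsilon_+(1 + e^{y_t})\right), \tag{B.3}$$

where $\eta$ is a random variable with standard normal distribution, and

$$g_{\Delta t}(t) := \frac{1}{\Delta t} E_\xi\left[ \ln \frac{f_{\Delta t,+}(\xi)}{f_{\Delta t,-}(\xi)} \,\Big|\, H(t) \right], \qquad \rho^2_{\Delta t}(t) := \frac{1}{\Delta t} \mathrm{Var}_\xi\left[ \ln \frac{f_{\Delta t,+}(\xi)}{f_{\Delta t,-}(\xi)} \,\Big|\, H(t) \right]. \tag{B.4}$$

As before, $E_\xi[F(\xi) | H(t)]$ is the expectation of $F(\xi)$ when $\xi$ is drawn from the
distribution $f_\pm(\xi)$ associated with the current state $H(t) = H_\pm$. Clearly, the drift $g_{\Delta t}$
and variance $\rho^2_{\Delta t}$ will diverge or vanish unless $f_{\Delta t,\pm}(\xi)$ are scaled appropriately in
the $\Delta t \to 0$ limit. We discuss different ways of introducing such a scaling in the next
section.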

Assuming that we have well-defined limits $g(t) := \lim_{\Delta t \to 0} g_{\Delta t}(t)$ and
$\rho^2(t) := \lim_{\Delta t \to 0} \rho^2_{\Delta t}(t)$, the discrete-time stochastic process, Eq. (B.3), approaches the SDE

$$dy = g(t)dt + \rho(t)dW_t + \left(\epsilon_-(e^{-y} + 1) - \epsilon_+(1 + e^{y})\right)dt, \tag{B.5}$$

where $W_t$ is a standard Wiener process. This limit holds in the sense of distributions.
Roughly, the smaller $\Delta t$ is, the closer the distributions of the random variables $y_n$ and
$y(t_n)$, whose evolutions are described by Eq. (B.1) and Eq. (B.5), respectively. This
correspondence can be made precise using the Donsker Invariance Principle (p. 520
in [6]).

In sum, Eq. (B.5) can be viewed as an approximation of the logarithm of the
likelihood ratio whose evolution is given exactly by Eq. (A.3). For a fixed interval $\Delta t$,
the parameters of the two equations are related via Eq. (B.4), and $\epsilon_{\Delta t,\pm}/\Delta t = \epsilon_\pm$.
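The linearization step leading from Eq. (B.1) to the drift of Eq. (B.5) can be checked numerically. The following sketch is an illustration under stated assumptions, with hypothetical function names: it compares the exact nonlinear increment of Eq. (B.1), using $\epsilon_{\Delta t,\pm} = \Delta t\,\epsilon_\pm$, with its first-order approximation, and also provides a single Euler-Maruyama step of Eq. (B.5).

```python
import numpy as np

def exact_increment(y_prev, llr, dt, eps_p, eps_m):
    """y_n - y_{n-1} from Eq. (B.1); llr is the log-likelihood ratio of the observation."""
    num = 1 - dt * eps_p + dt * eps_m * np.exp(-y_prev)
    den = 1 - dt * eps_m + dt * eps_p * np.exp(y_prev)
    return llr + np.log(num / den)

def linearized_increment(y, llr, dt, eps_p, eps_m):
    """First-order approximation in dt, as used in Eq. (B.2)."""
    return llr + dt * (eps_m * (np.exp(-y) + 1) - eps_p * (1 + np.exp(y)))

def euler_maruyama_step(y, g, rho, dt, eps_p, eps_m, rng):
    """One Euler-Maruyama step of Eq. (B.5) with drift g and noise amplitude rho."""
    drift = g + eps_m * (np.exp(-y) + 1) - eps_p * (1 + np.exp(y))
    return y + drift * dt + rho * np.sqrt(dt) * rng.standard_normal()
```

The discrepancy between `exact_increment` and `linearized_increment` at the same $y$ shrinks quadratically with `dt`, consistent with the terms of order $o(\Delta t)$ that were dropped in the derivation.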

Appendix C. Precise correspondence. We now discuss two approaches in
which the correspondence between Eqs. (B.1) and (B.5) can be made exact. We
choose a specific scaling for the drift and variance arising from each observation, $\xi_n$.
Suppose that over the time interval $\Delta t$, an observation, $\xi_n$, is a result of $r\Delta t$ separate
observations, for example the measurement of the direction of $r\Delta t$ different moving
dots [24]. In this case the estimate of the average of the individual measurements
(e.g., the average of the velocities of dots in a display) will have both a mean and a
variance that increase linearly with $\Delta t$.

As a concrete example we can compute $g(t)$ and $\rho(t)$ in SDE (B.5) when
observations, $\xi_n$, follow normal distributions with mean and variance scaled by $\Delta t$,

$$f_{\Delta t,\pm}(\xi) = \frac{1}{\sqrt{2\pi \Delta t \sigma^2}} e^{-(\xi - \Delta t \mu_\pm)^2/(2\Delta t \sigma^2)}.$$

Using Eq. (B.4) it is then straightforward to compute [7,8],

$$g_\pm = \pm \frac{(\mu_+ - \mu_-)^2}{2\sigma^2}, \qquad \rho^2 = \frac{(\mu_+ - \mu_-)^2}{\sigma^2},$$

and note that $g(t) \in \{g_+, g_-\}$ is a telegraph process (e.g., p. 77 in [21]) with the
probability masses $P(g_\pm, t)$ evolving according to the master equation
$P_t(g_\pm, t) = \mp\epsilon_+ P(g_+, t) \pm \epsilon_- P(g_-, t)$. In this case $\rho^2(t) = \rho^2$ remains constant.
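For the Gaussian densities above, Eq. (B.4) can be evaluated in closed form: the log-likelihood ratio is linear in $\xi$, which gives $g_\pm = \pm(\mu_+ - \mu_-)^2/(2\sigma^2)$ and $\rho^2 = (\mu_+ - \mu_-)^2/\sigma^2$, independent of $\Delta t$. The sketch below checks these values by Monte Carlo sampling. It is illustrative only; the function names, sample size, and parameter values are assumptions.

```python
import numpy as np

def log_likelihood_ratio(xi, dt, mu_p, mu_m, sigma):
    """ln f_{dt,+}(xi) / f_{dt,-}(xi) for Gaussians with mean dt*mu and variance dt*sigma^2."""
    return ((xi - dt * mu_m) ** 2 - (xi - dt * mu_p) ** 2) / (2 * dt * sigma**2)

def empirical_drift_and_variance(dt, mu_p, mu_m, sigma, state=1, n=200_000, seed=0):
    """Monte Carlo estimates of g_dt and rho_dt^2 from Eq. (B.4).

    state = +1 samples observations under H_+, state = -1 under H_-.
    """
    rng = np.random.default_rng(seed)
    mu = mu_p if state == 1 else mu_m
    xi = rng.normal(dt * mu, np.sqrt(dt) * sigma, size=n)
    llr = log_likelihood_ratio(xi, dt, mu_p, mu_m, sigma)
    return llr.mean() / dt, llr.var() / dt
```

With $\mu_\pm = \pm 1$ and $\sigma = 1$ the estimates should be close to $g_+ = 2$ and $\rho^2 = 4$ under $H_+$, and to $g_- = -2$ with the same variance under $H_-$, for any small `dt`.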

More generally, we can obtain an identical result by considering that each
observation made on a time interval consists of a number of sub-observations, each with
statistics that scale with the length of the interval and the number of sub-observations.
We define a family of stochastic processes parameterized by $k$, the number of
sub-observations made in an interval of length $\Delta t$. As above, when $k = 1$, we assume that
an observation is the result of $r\Delta t$ separate observations. Assuming $r$ is large, note
that for $k > 1$ each of the $k$ sub-observations contains roughly $r_k = \lfloor r\Delta t/k \rfloor$
observations with mean and variance that scale linearly with $\Delta t/k$. We can achieve
this by approximating $\ln \frac{f_{\Delta t,+}(\xi_n)}{f_{\Delta t,-}(\xi_n)}$ in Eq. (B.2) by the family of stochastic processes
parameterized by $k$ [8]

$$\frac{\Delta t}{k} \sum_{l=1}^{k} E_\xi\left[ \ln \frac{f_+(\xi)}{f_-(\xi)} \,\Big|\, H(t) \right] + \sqrt{\frac{\Delta t}{k}} \sum_{l=1}^{k} \left( \ln \frac{f_+(\xi_l)}{f_-(\xi_l)} - E_\xi\left[ \ln \frac{f_+(\xi)}{f_-(\xi)} \,\Big|\, H(t) \right] \right).$$

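The scaling in this approximation guarantees that the drift is given by the limit

$$g(t) = \lim_{\Delta t \to 0} g_{\Delta t}(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} E_\xi\left[ \ln \frac{f_{\Delta t,+}(\xi)}{f_{\Delta t,-}(\xi)} \,\Big|\, H(t) \right] = E_\xi\left[ \ln \frac{f_+(\xi)}{f_-(\xi)} \,\Big|\, H(t) \right]$$

and the variance

$$\rho^2(t) = \lim_{\Delta t \to 0} \rho^2_{\Delta t}(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \mathrm{Var}_\xi\left[ \ln \frac{f_{\Delta t,+}(\xi)}{f_{\Delta t,-}(\xi)} \,\Big|\, H(t) \right] = \mathrm{Var}_\xi\left[ \ln \frac{f_+(\xi)}{f_-(\xi)} \,\Big|\, H(t) \right].$$

Furthermore, as $k \to \infty$, by the Central Limit Theorem,

$$\frac{\Delta t}{k} \sum_{l=1}^{k} E_\xi\left[ \ln \frac{f_+(\xi)}{f_-(\xi)} \,\Big|\, H(t) \right] + \sqrt{\frac{\Delta t}{k}} \sum_{l=1}^{k} \left( \ln \frac{f_+(\xi_l)}{f_-(\xi_l)} - E_\xi\left[ \ln \frac{f_+(\xi)}{f_-(\xi)} \,\Big|\, H(t) \right] \right) + \Delta t\left(\epsilon_-(e^{-y_t} + 1) - \epsilon_+(1 + e^{y_t})\right)$$

converges in distribution to

$$\Delta y_t \approx \Delta t\, g(t) + \sqrt{\Delta t}\, \rho(t)\, \eta + \Delta t\left(\epsilon_-(e^{-y_t} + 1) - \epsilon_+(1 + e^{y_t})\right),$$

where $\eta$ is a standard normal random variable. Taking the limit $\Delta t \to 0$ yields
Eq. (B.5). When observations follow Gaussian distributions, $f_\pm \sim \mathcal{N}(\pm\mu, \sigma^2)$, then
$g(t) \in \{\pm 2\mu^2/\sigma^2\}$, $\rho = 2\mu/\sigma$, and

$$dy = \left[ g(t) + \left(\epsilon_-(e^{-y} + 1) - \epsilon_+(1 + e^{y})\right) \right] dt + \rho\, dW,$$

where $dW$ is a standard white noise process.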

Appendix D. Continuum limit for log probabilities with multiple
alternatives. We now describe the calculation of the continuum limit of the recursive
system defining the evolution of the probabilities $L_{n,i} = \Pr(H(t_n) = H_i | \xi_{1:n})$ of one
among multiple alternatives (environmental states), $H_i$, $i = 1, \ldots, N$. The state of the
environment, and equivalently the correct choice at time $t$, again changes stochastically.
We assume that the transitions between the alternatives are memoryless, with
transition rates $\epsilon_{\Delta t,ij} := \Pr(H(t_n) = H_i | H(t_{n-1}) = H_j)$. Using Bayes' rule and
rearranging terms (analogous to the derivation of Eqs. (A.1) and (A.2)), we can express
each probability $L_{n,i}$ in terms of the probabilities at the time of the previous observation,
$L_{n-1,j}$:

$$L_{n,i} = \frac{\Pr(\xi_{1:n-1})}{\Pr(\xi_{1:n})} \Pr(\xi_n | H(t_n) = H_i) \sum_{j=1}^{N} \epsilon_{\Delta t,ij} L_{n-1,j}.$$

Since we are only interested in comparing the magnitude of the probabilities, we can
drop the common prefactor $\Pr(\xi_{1:n-1})/\Pr(\xi_{1:n})$ and use the fact that $\sum_{j=1}^{N} \epsilon_{\Delta t,ji} = 1$ (since
$\epsilon_{\Delta t,ij}$ is a left stochastic matrix) to write $\epsilon_{\Delta t,ii} = 1 - \sum_{j \neq i} \epsilon_{\Delta t,ji}$ and obtain

$$L_{n,i} = f_{\Delta t,i}(\xi_n) \left[ \left( 1 - \sum_{j \neq i} \epsilon_{\Delta t,ji} \right) L_{n-1,i} + \sum_{j \neq i} \epsilon_{\Delta t,ij} L_{n-1,j} \right], \tag{D.1}$$

where $f_{\Delta t,i}(\xi_n) = \Pr(\xi_n | H_i, t_n)$. From Eq. (D.1), it follows that the log of the rescaled
probabilities, $x_{n,i} := \ln L_{n,i}$, satisfies the recursive relation

$$x_{n,i} - x_{n-1,i} = \ln f_{\Delta t,i}(\xi_n) + \ln\left( 1 - \sum_{j \neq i} \epsilon_{\Delta t,ji} + \sum_{j \neq i} \epsilon_{\Delta t,ij} e^{x_{n-1,j} - x_{n-1,i}} \right).$$

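To derive an approximating SDE, we denote by $\Delta x_{n,i} = x_{n,i} - x_{n-1,i}$ the change
in the log probability due to an observation at time $t_n$. As before, we assume
$\epsilon_{\Delta t,ij} := \Delta t \epsilon_{ij} + o(\Delta t)$ for $i \neq j$, and drop the higher order terms, giving

$$\Delta x_{n,i} = \ln f_{\Delta t,i}(\xi_n) + \ln\left( 1 - \sum_{j \neq i} \Delta t \epsilon_{ji} + \sum_{j \neq i} \Delta t \epsilon_{ij} e^{x_{n-1,j} - x_{n-1,i}} \right).$$

Assuming $\Delta t \ll 1$, we again use the approximation $\ln(1 + a) \approx a$ for $|a| \ll 1$. We
also assume that the change in the log probability, $|\Delta x_{n,i}| \ll 1$, is small over the time
interval $\Delta t$, so that

$$\begin{aligned} \Delta x_{n,i} &\approx \ln f_{\Delta t,i}(\xi_n) + \Delta t \sum_{j \neq i} \left( \epsilon_{ij} e^{x_{n,j} - x_{n,i}} - \epsilon_{ji} \right) \\ &= E_\xi\left[ \ln f_{\Delta t,i}(\xi) | H(t_n) \right] + \left( \ln f_{\Delta t,i}(\xi_n) - E_\xi\left[ \ln f_{\Delta t,i}(\xi) | H(t_n) \right] \right) \\ &\quad + \Delta t \sum_{j \neq i} \left( \epsilon_{ij} e^{x_{n,j} - x_{n,i}} - \epsilon_{ji} \right), \end{aligned} \tag{D.2}$$

where we condition on the current state of the environment $H(t_n) \in \{H_1, \ldots, H_N\}$.
Replacing the index $n$ by the time $t$, we can therefore write

$$\Delta x_{t,i} \approx \Delta t\, g_{\Delta t,i}(t) + \sqrt{\Delta t}\, \rho_{\Delta t,i}(t)\, \eta_i + \Delta t \sum_{j \neq i} \left( \epsilon_{ij} e^{x_{t,j} - x_{t,i}} - \epsilon_{ji} \right), \tag{D.3}$$

where the $\eta_i$'s are correlated random variables with standard normal distribution, and

$$g_{\Delta t,i}(t) := \frac{1}{\Delta t} E_\xi\left[ \ln f_{\Delta t,i}(\xi) | H(t) \right], \qquad \rho^2_{\Delta t,i}(t) := \frac{1}{\Delta t} \mathrm{Var}_\xi\left[ \ln f_{\Delta t,i}(\xi) | H(t) \right].$$

The correlation of the $\eta_i$'s is given by

$$\mathrm{Corr}[\eta_i, \eta_j] := \mathrm{Corr}_\xi\left[ \ln f_{\Delta t,i}(\xi), \ln f_{\Delta t,j}(\xi) | H(t) \right].$$

Note that Eq. (D.3) is the multiple-alternative version of Eq. (B.3). Equivalently, we
can write Eq. (D.3) as

$$\Delta x_{t,i} \approx \Delta t\, g_{\Delta t,i}(t) + \sqrt{\Delta t}\, W_{\Delta t,i} + \Delta t \sum_{j \neq i} \left( \epsilon_{ij} e^{x_{t,j} - x_{t,i}} - \epsilon_{ji} \right),$$

where $W_{\Delta t} := (W_{\Delta t,1}, \ldots, W_{\Delta t,N})$ follows a multivariate Gaussian distribution with
mean zero and covariance matrix $\Sigma_{\Delta t}$ given by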

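$$\Sigma_{\Delta t,ij} = \frac{1}{\Delta t} \mathrm{Cov}_\xi\left[ \ln f_{\Delta t,i}(\xi), \ln f_{\Delta t,j}(\xi) | H(t) \right].$$

Finally, taking the limit $\Delta t \to 0$, and assuming that the limits

$$g_i(t) := \lim_{\Delta t \to 0} g_{\Delta t,i}(t) \quad \text{and} \quad \Sigma_{ij}(t) := \lim_{\Delta t \to 0} \Sigma_{\Delta t,ij}(t) \tag{D.4}$$

are well defined, we obtain the system of SDEs

$$dx_i = g_i(t)dt + dW_i(t) + \sum_{j \neq i} \left( \epsilon_{ij} e^{x_j - x_i} - \epsilon_{ji} \right) dt, \tag{D.5}$$

or equivalently the vector system

$$dx = g(t)dt + A(t)dW_t + K(x)dt,$$

where $g(t) = (g_1(t), \ldots, g_N(t))^T$ and $A(t)A(t)^T = \Sigma(t)$ are defined using the limits in
Eq. (D.4), $K_i(x) = \sum_{j \neq i} (\epsilon_{ij} e^{x_j - x_i} - \epsilon_{ji})$, and the components of $W_t$ are independent
Wiener processes. We can recover Eq. (B.5) by taking $N = 2$, letting $y = x_1 - x_2$,
and exchanging the indices 1 and 2 with $+$ and $-$, respectively.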
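The nonlinear coupling term $K_i(x) = \sum_{j \neq i}(\epsilon_{ij} e^{x_j - x_i} - \epsilon_{ji})$ is straightforward to evaluate for any number of alternatives. The sketch below is a minimal illustration; the function name and the rate values are assumptions. The matrix `eps` stores $\epsilon_{ij}$, the rate of switching from state $j$ to state $i$, with the diagonal ignored.

```python
import numpy as np

def K(x, eps):
    """Coupling term K_i(x) = sum_{j != i} (eps[i, j] * exp(x_j - x_i) - eps[j, i])."""
    x = np.asarray(x, dtype=float)
    eps = np.asarray(eps, dtype=float)
    off_diag = ~np.eye(len(x), dtype=bool)
    diff = x[None, :] - x[:, None]  # diff[i, j] = x_j - x_i
    gain = np.where(off_diag, eps * np.exp(diff), 0.0).sum(axis=1)
    loss = np.where(off_diag, eps.T, 0.0).sum(axis=1)  # sum over j of eps[j, i]
    return gain - loss
```

For $N = 2$, identifying state 1 with $H_+$ and state 2 with $H_-$ means $\epsilon_{12} = \epsilon_-$ (the rate of leaving $H_-$) and $\epsilon_{21} = \epsilon_+$. With $y = x_1 - x_2$, the difference `K(x, eps)[0] - K(x, eps)[1]` then equals $\epsilon_-(e^{-y} + 1) - \epsilon_+(1 + e^{y})$, the nonlinear drift for two alternatives.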

As in the case of two alternatives, Eq. (D.5) can be viewed as an approximation
of the logarithm of the probability whose evolution is given exactly by Eq. (D.1). For
a fixed interval $\Delta t$, the parameters of these equations are related via Eq. (D.4), and
$\epsilon_{\Delta t,ij}/\Delta t = \epsilon_{ij}$.

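The limits $g_i(t) := \lim_{\Delta t \to 0} g_{\Delta t,i}(t)$ and $\Sigma_{ij}(t) := \lim_{\Delta t \to 0} \Sigma_{\Delta t,ij}(t)$ are defined
when the statistics of the observations scale with $\Delta t$. As we argued above, this can
be obtained by considering observations drawn from a normal distribution with mean
and variance scaled by $\Delta t$:

$$f_{\Delta t,i}(\xi) = \frac{1}{\sqrt{2\pi \Delta t \sigma^2}} e^{-(\xi - \Delta t \mu_i)^2/(2\Delta t \sigma^2)}.$$

Alternatively, the required scaling can also be obtained when each observation made
on a time interval consists of a number of sub-observations, $(\xi_1, \ldots, \xi_k)$, with mean
and variance scaled by $\Delta t/k$. To do so we approximate $\ln f_{\Delta t,i}(\xi_n)$ in Eq. (D.2) by

$$\sum_{l=1}^{k} \frac{\Delta t}{k} E_\xi\left[ \ln f_i(\xi) | H(t) \right] + \sum_{l=1}^{k} \sqrt{\frac{\Delta t}{k}} \left( \ln f_i(\xi_l) - E_\xi\left[ \ln f_i(\xi) | H(t) \right] \right).$$
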

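Appendix E. Log-likelihood ratio for multiple alternatives. We can
also derive a continuum limit for the log-likelihood ratio for any two choices
$i, j \in \{1, 2, \ldots, N\}$. From Eq. (D.1), the likelihood ratio is $R_{n,ij} = L_{n,i}/L_{n,j}$. We note that
this will provide us with a matrix of stochastic processes. We start with the recursive
equation

$$R_{n,ij} = \frac{f_{\Delta t,i}(\xi_n)}{f_{\Delta t,j}(\xi_n)} \cdot \frac{\left(1 - \sum_{k \neq i} \epsilon_{\Delta t,ki}\right) R_{n-1,ij} + \sum_{k \neq i} \epsilon_{\Delta t,ik} R_{n-1,kj}}{1 - \sum_{k \neq j} \epsilon_{\Delta t,kj} + \sum_{k \neq j} \epsilon_{\Delta t,jk} R_{n-1,kj}}.$$

We can thus derive the continuum equation for the log-likelihood ratio
$y_{n,ij} := \ln R_{n,ij}$, as we did in the case of two alternatives. Since $y_{ij}(t)$ is the difference
$y_{ij}(t) = x_i(t) - x_j(t)$, from Eq. (D.5) we obtain

$$dy_{ij} = (g_i(t) - g_j(t))dt + dW_i(t) - dW_j(t) + \sum_{k \neq i} \left( \epsilon_{ik} e^{y_{ki}} - \epsilon_{ki} \right) dt - \sum_{k \neq j} \left( \epsilon_{jk} e^{y_{kj}} - \epsilon_{kj} \right) dt,$$

or

$$dy_{ij} = \left[ g_{ij}(t) + \sum_{k \neq j} \epsilon_{kj} - \sum_{k \neq i} \epsilon_{ki} + \sum_{k \neq i} \epsilon_{ik} e^{y_{ki}} - \sum_{k \neq j} \epsilon_{jk} e^{y_{kj}} \right] dt + dW_{ij}, \tag{E.1}$$

where $g_{ij}(t) = E_\xi\left[ \ln \frac{f_i(\xi)}{f_j(\xi)} \,\Big|\, H(t) \right]$ and $W_{ij}$ is a Wiener process with covariance matrix
given by $\mathrm{Cov}\left[ W_{ij}, W_{i'j'} | H(t) \right] = \mathrm{Cov}_\xi\left[ \ln \frac{f_i(\xi)}{f_j(\xi)}, \ln \frac{f_{i'}(\xi)}{f_{j'}(\xi)} \,\Big|\, H(t) \right]$. We can also write
Eq. (E.1) in vector form

$$dy = g\,dt + A(t)dW_t + K(y)dt,$$

where $K_{ij}(y) = \sum_{k \neq j} \epsilon_{kj} - \sum_{k \neq i} \epsilon_{ki} + \sum_{k \neq i} \epsilon_{ik} e^{y_{ki}} - \sum_{k \neq j} \epsilon_{jk} e^{y_{kj}}$, $A(t)A(t)^T = \Sigma(t)$
is the covariance matrix, and the components of $W_t$ are independent Wiener
processes.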
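The pairwise drift $K_{ij}(y)$ in Eq. (E.1) should coincide with the difference $K_i(x) - K_j(x)$ of the single-alternative drifts from Eq. (D.5) whenever $y_{ki} = x_k - x_i$. The sketch below checks this identity numerically for a small example. The function names and random test values are illustrative assumptions; `eps[i, j]` stores $\epsilon_{ij}$.

```python
import numpy as np

def K_single(x, eps, i):
    """K_i(x) = sum_{k != i} (eps[i, k] * exp(x_k - x_i) - eps[k, i])."""
    return sum(eps[i, k] * np.exp(x[k] - x[i]) - eps[k, i]
               for k in range(len(x)) if k != i)

def K_pair(y, eps, i, j):
    """K_ij(y) from Eq. (E.1), where y[k, i] = x_k - x_i."""
    N = eps.shape[0]
    not_i = [k for k in range(N) if k != i]
    not_j = [k for k in range(N) if k != j]
    return (sum(eps[k, j] for k in not_j)
            - sum(eps[k, i] for k in not_i)
            + sum(eps[i, k] * np.exp(y[k, i]) for k in not_i)
            - sum(eps[j, k] * np.exp(y[k, j]) for k in not_j))
```

Building `y = x[:, None] - x[None, :]` from any vector `x` of log probabilities, `K_pair(y, eps, i, j)` should match `K_single(x, eps, i) - K_single(x, eps, j)` for every pair $i \neq j$.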

Appendix F. Log probabilities for a continuum of alternatives. Finally,
we examine the case where an observer must choose between a continuum of hypotheses
$H_\theta$ where $\theta \in [a, b]$. Thus, we will first derive a discrete recursive equation for
the evolution of the probabilities $L_{n,\theta} = \Pr(H(t_n) = H_\theta | \xi_{1:n})$. The state of the
environment, the correct choice at time $t$, again changes according to a continuous time
Markov process. We define this stochastically switching process through its transition
rate function $\epsilon_{\Delta t,\theta\theta'}$, which is given for $\theta' \neq \theta$ as

$$\int_{\theta_1}^{\theta_2} \epsilon_{\Delta t,\theta\theta'}\, d\theta := \Pr(H(t_n) \in H_{[\theta_1,\theta_2]} | H(t_{n-1}) = H_{\theta'}),$$

where $H_{[\theta_1,\theta_2]}$ is the set of all states $H_\theta$ with $\theta$ in the interval $[\theta_1, \theta_2]$. Thus, $\epsilon_{\Delta t,\theta\theta'}$
describes the probability of a transition over a timestep, $\Delta t$, from state $H_{\theta'}$ to some
state $H_\theta$ with $\theta \in [\theta_1, \theta_2]$. This means that
$\Pr(H(t_n) = H_\theta | H(t_{n-1}) = H_\theta) = 1 - \int_a^b \epsilon_{\Delta t,\theta'\theta}\, d\theta'$. As in the derivation of the multiple alternative $2 < N < \infty$ case,
we can express each probability $L_{n,\theta}$ at time $t_n$ in terms of the probabilities $L_{n-1,\theta'}$
at time $t_{n-1}$, so

$$L_{n,\theta} = \frac{\Pr(\xi_{1:n-1})}{\Pr(\xi_{1:n})} \Pr(\xi_n | H(t_n) = H_\theta) \left[ \Pr(H(t_n) = H_\theta | H(t_{n-1}) = H_\theta) L_{n-1,\theta} + \int_a^b \epsilon_{\Delta t,\theta\theta'} L_{n-1,\theta'}\, d\theta' \right].$$

Notice that the sum from the $N < \infty$ case, as in Eq. (D.1), has been replaced with
an integral over all possible hypotheses $H_{\theta'}$, $\theta' \in [a, b]$, and a term corresponding to
the probability of the environment not changing. Again we drop the common factor
$\Pr(\xi_{1:n-1})/\Pr(\xi_{1:n})$, since we wish to compare the magnitudes of the probabilities. We obtain

$$L_{n,\theta} = f_{\Delta t,\theta}(\xi_n) \left[ \left( 1 - \int_a^b \epsilon_{\Delta t,\theta'\theta}\, d\theta' \right) L_{n-1,\theta} + \int_a^b \epsilon_{\Delta t,\theta\theta'} L_{n-1,\theta'}\, d\theta' \right], \tag{F.1}$$

where $f_{\Delta t,\theta}(\xi_n) = \Pr(\xi_n | H(t_n) = H_\theta)$. From Eq. (F.1), we can thus derive a recursive
relation for the log of the rescaled probabilities $x_{n,\theta} := \ln L_{n,\theta}$ in terms of $x_{n-1,\theta}$, so


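$$x_{n,\theta} - x_{n-1,\theta} = \ln f_{\Delta t,\theta}(\xi_n) + \ln\left( 1 - \int_a^b \epsilon_{\Delta t,\theta'\theta}\, d\theta' + \int_a^b \epsilon_{\Delta t,\theta\theta'} e^{x_{n-1,\theta'} - x_{n-1,\theta}}\, d\theta' \right).$$

To approximate this discrete-time stochastic process with an SDE, we denote by
$\Delta x_{n,\theta} = x_{n,\theta} - x_{n-1,\theta}$ the change in log probability due to the observation at time
$t_n$. Furthermore, we assume $\epsilon_{\Delta t,\theta\theta'} := \Delta t \epsilon_{\theta\theta'} + o(\Delta t)$ and drop higher order terms, giving

$$\Delta x_{n,\theta} = \ln f_{\Delta t,\theta}(\xi_n) + \ln\left( 1 - \int_a^b \Delta t \epsilon_{\theta'\theta}\, d\theta' + \int_a^b \Delta t \epsilon_{\theta\theta'} e^{x_{n-1,\theta'} - x_{n-1,\theta}}\, d\theta' \right).$$

Assuming $\Delta t \ll 1$, we again use the approximation $\ln(1 + a) \approx a$ for $|a| \ll 1$. Assuming
$|\Delta x_{n,\theta}| \ll 1$,

$$\begin{aligned} \Delta x_{n,\theta} &\approx \ln f_{\Delta t,\theta}(\xi_n) + \Delta t \int_a^b \left( \epsilon_{\theta\theta'} e^{x_{n,\theta'} - x_{n,\theta}} - \epsilon_{\theta'\theta} \right) d\theta' \\ &= E_\xi\left[ \ln f_{\Delta t,\theta}(\xi) | H(t_n) \right] + \left( \ln f_{\Delta t,\theta}(\xi_n) - E_\xi\left[ \ln f_{\Delta t,\theta}(\xi) | H(t_n) \right] \right) \\ &\quad + \Delta t \int_a^b \left( \epsilon_{\theta\theta'} e^{x_{n,\theta'} - x_{n,\theta}} - \epsilon_{\theta'\theta} \right) d\theta', \end{aligned}$$

conditioned on the current state of the environment $H(t_n) = H_\phi$ where $\phi \in [a, b]$.
Exchanging the index $n$ with the time $t$, we can therefore write

$$\Delta x_{t,\theta} \approx \Delta t\, g_{\Delta t,\theta}(t) + \sqrt{\Delta t}\, \rho_{\Delta t,\theta}(t)\, \eta_\theta + \Delta t \int_a^b \left( \epsilon_{\theta\theta'} e^{x_{t,\theta'} - x_{t,\theta}} - \epsilon_{\theta'\theta} \right) d\theta', \tag{F.2}$$

where the $\eta_\theta$'s are correlated random variables which marginally follow a standard normal
distribution, and

$$g_{\Delta t,\theta}(t) := \frac{1}{\Delta t} E_\xi\left[ \ln f_{\Delta t,\theta}(\xi) | H(t) \right], \qquad \rho^2_{\Delta t,\theta}(t) := \frac{1}{\Delta t} \mathrm{Var}_\xi\left[ \ln f_{\Delta t,\theta}(\xi) | H(t) \right].$$

The correlation of the $\eta_\theta$'s is given by

$$\mathrm{Corr}[\eta_\theta, \eta_{\theta'}] := \mathrm{Corr}_\xi\left[ \ln f_{\Delta t,\theta}(\xi), \ln f_{\Delta t,\theta'}(\xi) | H(t) \right].$$
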

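Equivalently, we can write Eq. (F.2) as

$$\Delta x_{t,\theta} \approx \Delta t\, g_{\Delta t,\theta}(t) + \sqrt{\Delta t}\, W_{\Delta t,\theta} + \Delta t \int_a^b \left( \epsilon_{\theta\theta'} e^{x_{t,\theta'} - x_{t,\theta}} - \epsilon_{\theta'\theta} \right) d\theta',$$

where $W_{\Delta t} := (W_{\Delta t,\theta})_{\theta \in [a,b]}$. For $\theta \in [a, b]$, $W_{\Delta t,\theta}$ is a Gaussian process in the
sense that its values at any finite set of points $\{\theta_1, \ldots, \theta_n\} \subset [a, b]$ have a multivariate Gaussian
distribution with mean zero and covariance, $\Sigma_{\Delta t,\theta\theta'}$, given by

$$\Sigma_{\Delta t,\theta\theta'} = \frac{1}{\Delta t} \mathrm{Cov}_\xi\left[ \ln f_{\Delta t,\theta}(\xi), \ln f_{\Delta t,\theta'}(\xi) | H(t) \right].$$

Finally, taking the limit $\Delta t \to 0$, and assuming that the limits

$$g_\theta(t) := \lim_{\Delta t \to 0} g_{\Delta t,\theta}(t), \quad \text{and} \quad \Sigma_{\theta\theta'}(t) := \lim_{\Delta t \to 0} \Sigma_{\Delta t,\theta\theta'}(t), \tag{F.3}$$

are well defined, we obtain the system of SDEs

$$dx_\theta = g_\theta(t)dt + dW_\theta(t) + \int_a^b \left( \epsilon_{\theta\theta'} e^{x_{\theta'} - x_\theta} - \epsilon_{\theta'\theta} \right) d\theta'\, dt, \tag{F.4}$$

or equivalently the system of SDEs

$$dx = g(t)dt + A(t)dW_t + K(x)dt,$$

where $g(t) = (g_\theta(t))_{\theta \in [a,b]}$ and $A(t)A(t)^T = \Sigma(t)$ are defined using the limits in
Eq. (F.3), $K(x) = \int_a^b \left( \epsilon_{\theta\theta'} e^{x_{\theta'} - x_\theta} - \epsilon_{\theta'\theta} \right) d\theta'$, and the components of $W_t$ are independent
Wiener processes.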

While we have formally taken the limit of the discrete Eq. (F.2), it is important
to note that establishing the well-posedness of stochastic integro-differential equations
is not straightforward. Conditions for the existence and uniqueness of solutions to
certain nonlinear stochastic partial differential equations (SPDEs) are demonstrated in
Ch. 7 of [15]. This approach considers the solutions to SPDEs to be random processes
that take their values in a Hilbert space of functions. Recently, this concept has
been extended to provide general conditions on the constituent functions of stochastic
neural fields to ensure the existence of solutions [19,25]. The form of stochastic neural
fields is closely related to Eq. (F.4), since both types of equation possess a linear drift
and a convolution defining a nonlocal coupling between their state variables. It may be
possible to utilize these previous approaches to establish the existence and uniqueness
of solutions to Eq. (F.4) in future studies.